Abstract. In Gold’s framework of inductive inference, the model of partial
learning requires the learner to output exactly one correct index for
the target object and only the target object infinitely often. Since
infinitely many of the learner’s hypotheses may be incorrect, it is not
obvious whether a partial learner can be modified to “approximate” the
target object.

Fulk and Jain (Approximate inference and scientific method. Information
and Computation 114(2):179-191, 1994) introduced a model of approximate
learning of recursive functions. The present work extends their
research and solves an open problem of Fulk and Jain by showing that
there is a learner which approximates and partially identifies every
recursive function by outputting a sequence of hypotheses which, in addition,
are also almost all finite variants of the target function.

The subsequent study is dedicated to the question of how these findings
generalise to the learning of r.e. languages from positive data. Here three
variants of approximate learning will be introduced and investigated with
respect to the question whether they can be combined with partial learning.
Following the line of Fulk and Jain’s research, further investigations
provide conditions under which partial language learners can eventually
output only finite variants of the target language.


1 Introduction 

Gold [10] considered a learning scenario where the learner is fed with piecewise 
increasing amounts of finite data about a given target language L; at every stage 
where a new input datum is given, the learner makes a conjecture about L. If 
there is exactly one correct representation of L that the learner always outputs
after some finite time (assuming that it never stops receiving data about L), then 
the learner is said to have “identified L in the limit.” In this paper, it is assumed 
that all target languages are encoded as recursively enumerable (r.e.) sets of 
natural numbers, and that the learner uses Gödel numbers as its hypotheses.

Gold’s learning paradigm has been used as a basis for a variety of theoretical 
models in subjects such as human language acquisition [18] and the theory of 
scientific inquiry in the philosophy of science [4,17]. This paper is mainly
concerned with the partial learning model [19], which retains several features of
Gold’s original framework - the modelling of learners as recursive functions, the 
use of texts as the mode of data presentation and the restriction of target classes 
to the family of all r.e. sets - while liberalising the learning criterion by only 
requiring the learner to output exactly one hypothesis of the target set infinitely 
often while it must output any other hypothesis only finitely often. It is known 
that partial learning is so powerful that the class of all r.e. languages can be 
partially learnt [19]. 

However, the model of partial learning puts no further constraints on those 
hypotheses that are output only finitely often. In particular, it offers no notion 
of “eventually being correct” or even “approximating” the target object. From a 
philosophical point of view, if partial learning is to be taken seriously as a model 
of language acquisition, then it is quite plausible that learners are capable of 
gradually improving the quality of their hypotheses over time. For instance, if 
the learner M sees a sentence S in the text at some point, then it is conceivable 
that after some finite time, M will only conjecture grammars that generate S. 
This leads one to consider a notion of the learner “approximating” the target 
language. 

The central question in this paper is whether any partial learner can be 
redefined in a way that it approximates the target object and still partially 
learns it. The first results, in the context of partial learning, deal with Fulk and 
Jain’s [5] notion of approximating recursive functions. Fulk and Jain proved the 
existence of a learner that “approximates” every recursive function. This result 
is generalised as follows: partial learners can always be made to approximate 
recursive functions according to their model and, in addition, eventually output 
only finite variants of the target function, that is, they can be designed as BC*
learners¹. This result solves an open question posed by Fulk and Jain, namely
whether recursive functions can be approximated by BC* learners. Note that 
BC* learning can also, in some sense, be considered a form of approximation, 
as it requires that eventually all of the hypotheses (including those output only 
finitely often) differ from the target object in only finitely many values. It thus 
is interesting to see that partial learning can be combined not only with Fulk 
and Jain’s model of approximation, but also with BC* learning at the same 
time. Note that in this paper, when two learning criteria A and B are said to 
be combinable, it is generally not assumed that the new learner is effectively 
constructed from the A-learner and the B-learner.

¹ BC* is mnemonic for “behaviourally correct with finitely many anomalies” [4].

This raises the question whether partial learners can also be turned into
approximate learners in the more general case of learning r.e. languages.
Unfortunately, Fulk and Jain’s model applies only to learning recursive functions. The
second contribution is the design of three notions of approximate learning of r.e. 
languages, two of which are directly inspired by Fulk and Jain’s model. It is then 
investigated under which conditions partial learners can be modified to fulfill the 
corresponding constraints of approximate learning. These investigations are also 
extended to partial learners with additional constraints, such as consistency and 
conservativeness. It will be shown that while partial learners can always be
constructed in a way so that for any given finite set D, their hypotheses will almost
always agree with the target language on D, the same does not hold if D must
be a finite variant of a fixed infinite set. Thus trade-offs between certain
approximate learning constraints and partial learning are sometimes unavoidable
- an observation that perhaps has a broader implication in the philosophy of 
language learning. 

Following the line of Fulk and Jain’s research, conditions are investigated 
under which partial language learners can eventually output only finite variants 
of the target language. While it remains open whether or not partial learners
for a given BC*-learnable class can be made BC*-learners for this class without
losing identification power, some natural conditions on a BC* learner M are 
provided under which all classes learnable by M can be learnt by some BC* 
learner N that is at the same time a partial learner. 

Figure 1 summarises the main results of this paper. RECPart and
RECApproxBC*Part refer respectively to partial learning of recursive functions and
approximate BC* partial learning of recursive functions. The remaining learning
criteria are abbreviated (see Definitions 3, 4 and 8), and denote learning of classes
of r.e. languages. An arrow from criterion A to criterion B means that the
collection of classes learnable under model A is contained in that learnable under
model B. Each arrow is labelled with the Corollary/Example/Remark/Theorem
number(s) that prove(s) the relationship represented by the arrow. If there
is no path from A to B, then the collection of classes learnable under model A 
is not contained in that learnable under model B. 


2 Preliminaries 

The notation and terminology from recursion theory adopted in this paper
follow in general the book of Rogers [20]. Background on inductive inference can be
found in [11]. The symbol N denotes the set of natural numbers, {0,1,2,...}. Let
φ_0, φ_1, φ_2, ... denote a fixed acceptable numbering [20] of all partial-recursive
functions over N. Given a set S, S* denotes the set of all finite sequences in
S. Wherever no confusion may arise, S will also denote its own characteristic
function, that is, for all x ∈ N, S(x) = 1 if x ∈ S and S(x) = 0 otherwise.
One defines the e-th r.e. set W_e as dom(φ_e) and the e-th canonical finite set by
choosing D_e such that Σ_{x ∈ D_e} 2^x = e. This paper fixes a one-one padding
function pad with W_{pad(e,d)} = W_e for all e, d. Furthermore, ⟨x,y⟩ denotes Cantor’s
pairing function, given by ⟨x,y⟩ = ½(x + y)(x + y + 1) + y. A triple ⟨x,y,z⟩
denotes ⟨⟨x,y⟩,z⟩. The notation η(x)↓ means that η(x) is defined, and η(x)↑
means that η(x) is undefined. The notation φ_e(x)↑ means that φ_e(x) remains
undefined and φ_{e,s}(x)↓ means that φ_e(x) is defined within s steps, that is, the
computation of φ_e(x) halts within s steps. K denotes the halting problem, that
is, K = {x : φ_x(x)↓}. For any r.e. set A, A_s denotes the s-th approximation of
A; it is assumed that for all s, |A_{s+1} − A_s| ≤ 1 and A_s ⊆ {0,...,s}.

Fig. 1. Learning hierarchy. [Figure omitted; recoverable fragment: Approx <- ApproxPart <- ApproxBC*Part, arrow labels Thm 29, Prop 10.]

For any σ, τ ∈ (N ∪ {#})*, σ ⪯ τ if and only if σ is a prefix of τ, σ ≺ τ if
and only if σ is a proper prefix of τ, and σ(n) denotes the element in the n-th
position of σ, starting from n = 0. For each σ ≠ ε, σ' denotes the string obtained
from σ by deleting the last symbol of σ. The concatenation of two strings σ and
τ shall be denoted by σ∘τ; for convenience, and whenever there is no possibility
of confusion, this is occasionally denoted by στ. Let σ[n] denote the sequence
σ(0) ∘ σ(1) ∘ ... ∘ σ(n − 1). The length of σ is denoted by |σ|.
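
The coding conventions of this section can be tried out directly. The following Python sketch is only an illustration: all function names are hypothetical, the pause symbol # is represented by None, and canonical_set assumes the usual convention that D_e is the finite set whose characteristic bit pattern is the binary expansion of e.

```python
from typing import Optional, Sequence, Set

Symbol = Optional[int]  # a natural number, or None standing for the pause symbol '#'


def cantor_pair(x: int, y: int) -> int:
    """Cantor's pairing function <x, y> = (x + y)(x + y + 1)/2 + y."""
    return (x + y) * (x + y + 1) // 2 + y


def cantor_triple(x: int, y: int, z: int) -> int:
    """A triple <x, y, z> is coded as <<x, y>, z>."""
    return cantor_pair(cantor_pair(x, y), z)


def canonical_set(e: int) -> Set[int]:
    """Assumed convention: D_e is the set with sum of 2**x over x in D_e equal to e."""
    return {x for x in range(e.bit_length()) if (e >> x) & 1}


def is_prefix(sigma: Sequence[Symbol], tau: Sequence[Symbol]) -> bool:
    """True iff sigma is a (not necessarily proper) prefix of tau."""
    return len(sigma) <= len(tau) and list(tau[:len(sigma)]) == list(sigma)


def drop_last(sigma: Sequence[Symbol]) -> list:
    """The string sigma' obtained by deleting the last symbol of a nonempty sigma."""
    if not sigma:
        raise ValueError("sigma' is only defined for nonempty sigma")
    return list(sigma[:-1])
```

For instance, canonical_set(5) is the set {0, 2}, since 5 = 2^0 + 2^2.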


3 Learning 

The basic learning paradigms studied in the present paper are behaviourally 
correct learning [2,3] and partial learning [19]. These learning models assume 
that the learner is presented with just positive examples of the target language, 
and that the learner is fed with a finite amount of data at every stage. They are 
modifications of the model of explanatory learning (or “learning in the limit”), 
first introduced by Gold [10], in which the learner must output in the limit a 
single correct representation h of the target language L; if L is an r.e. set, then h is 
usually an r.e. index of L with respect to the standard numbering W_0, W_1, W_2, ...
of all r.e. sets. Barzdins [2] and Case [3] considered the more powerful model of 
behaviourally correct learning, whereby the learner must almost always output 
a correct hypothesis of the input set, but some of the correct hypotheses may 
be syntactically distinct. Case and Smith [4] also introduced a less stringent 
variant of BC learning of recursive functions, BC* learning, which only requires 
the learner to output in the limit finite variants of the target recursive function. 
Still more general is the criterion of partial learning that Osherson, Stob and 
Weinstein [19] defined; in this model, the learner must output exactly one correct 
index of the input set infinitely often and output any other conjecture only 
finitely often. 

One can also impose constraints on the quality of a learner’s hypotheses. 
For example, Angluin [1] introduced the notion of consistency, which is the
requirement that the learner’s hypotheses must enumerate at least all the data 
seen up to the current stage. This seems to be a fairly natural demand on the 
learner, for it only requires that the learner’s conjectures never contradict the 
available data on the target language. Angluin [1] also introduced the learning 
constraint of conservativeness; intuitively, a conservative learner never makes 
a mind change unless its prior conjecture does not enumerate all the current 
data. A further constraint proposed by Osherson, Stob and Weinstein [18] is 
confidence, according to which the learner must converge on any (even non r.e.) 
text. These three learning criteria have since been adapted to the partial learning 
model [7,8].

Lange and Zeugmann [15] showed that learning in the limit is less powerful
if the hypothesis space of the learner is restricted to the target class. It would
thus be quite natural to ask whether this constraint on the hypothesis space of
the learner has a similar effect on partial learning or on approximate learning.
For this purpose, one distinguishes between class-comprising learning and
class-preserving learning [15]. If the learner M only conjectures languages that it
can successfully learn, then M is said to be prudent [18]. The learning criteria 
discussed so far (and, where applicable, their partial learning analogues) are 
formally introduced below. 

Definition 1. [21] M is said to class-comprisingly learn C if it learns C with
respect to a hypothesis space {H_0, H_1, H_2, ...}, where H_0, H_1, H_2, ... are r.e.
sets, such that C ⊆ {H_0, H_1, H_2, ...}.

Definition 2. [21] M is said to class-preservingly (ClsPresv) learn C if it learns
C with respect to a hypothesis space {H_0, H_1, H_2, ...}, where H_0, H_1, H_2, ... are
r.e. sets, such that C = {H_0, H_1, H_2, ...}.


Throughout this paper, successful learning with respect to a class C will generally 
refer to class-comprising learning unless specified otherwise. 

Let C be a class of r.e. sets. Throughout this paper, the mode of data
presentation is that of a text, by which is meant an infinite sequence of natural
numbers and the # symbol. Formally, a text T_L for some L in C is a map
T_L : N → N ∪ {#} such that L = range(T_L); here, T_L[n] denotes the sequence
T_L(0) ∘ T_L(1) ∘ ... ∘ T_L(n − 1) and the range of a text T, denoted range(T), is
the set of numbers occurring in T. Analogously, for a finite sequence σ, range(σ)
is the set of numbers occurring in σ. A text, in other words, is a presentation
of positive data from the target set. A learner, denoted by M in the following
definitions, is a recursive function mapping (N ∪ {#})* into N. M may also be
equipped with an oracle. In this case, a learner that has access to oracle A is an
A-recursive function mapping (N ∪ {#})* into N.
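
To make the notions of text, range and learner concrete, here is a small Python sketch. It is an illustration under stated assumptions, not part of the formal model: the pause symbol # is represented by None, a text is modelled as an infinite generator, and the toy learner returns the finite set it conjectures rather than an r.e. index. All names (content, text_for_evens, run_learner, range_learner) are hypothetical.

```python
from itertools import islice
from typing import Callable, FrozenSet, Iterator, List, Optional, Sequence, Set

Symbol = Optional[int]  # a natural number, or None for the pause symbol '#'


def content(sigma: Sequence[Symbol]) -> Set[int]:
    """range(sigma): the set of numbers occurring in sigma (pauses are ignored)."""
    return {x for x in sigma if x is not None}


def text_for_evens() -> Iterator[Symbol]:
    """One possible text for the set of even numbers: 0, #, 2, #, 4, #, ..."""
    n = 0
    while True:
        yield n
        yield None
        n += 2


def prefix(text: Iterator[Symbol], n: int) -> List[Symbol]:
    """T[n]: the first n symbols T(0), ..., T(n - 1) of the text."""
    return list(islice(text, n))


def range_learner(sigma: Sequence[Symbol]) -> FrozenSet[int]:
    """Toy learner that conjectures exactly the data seen so far."""
    return frozenset(content(sigma))


def run_learner(learner: Callable[[Sequence[Symbol]], FrozenSet[int]],
                text: Iterator[Symbol], stages: int) -> List[FrozenSet[int]]:
    """The learner's conjectures on T[0], T[1], ..., T[stages]."""
    data = prefix(text, stages)
    return [learner(data[:n]) for n in range(stages + 1)]
```

Only finitely many conjectures can ever be inspected this way, so such a simulation can illustrate a learner's behaviour on a text but cannot establish any of the limit criteria defined below.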

Definition 3. (i) [19] M partially (Part) learns C if, for every L in C and
each text T_L for L, there is exactly one index e such that M(T_L[k]) = e
for infinitely many k; furthermore, if M outputs e infinitely often on T_L,
then L = W_e.

(ii) [3] M behaviourally correctly (BC) learns C if, for every L in C and each
text T_L for L, there is a number n for which L = W_{M(T_L[j])} whenever
j > n.

(iii) [1] M is consistent (Cons) if for all σ ∈ (N ∪ {#})*, range(σ) ⊆ W_{M(σ)}.

(iv) [1] For any text T, M is consistent on T if range(T[n]) ⊆ W_{M(T[n])} for all
n > 0.

(v) [8] M is said to consistently partially (ConsPart) learn C if it partially
learns C from text and is consistent.

(vi) [7] M is said to conservatively partially (ConsvPart) learn C if it partially
learns C and outputs on each text for every L in C exactly one index e
with L ⊆ W_e.

(vii) [8] M is said to confidently partially (ConfPart) learn C if it partially learns
C from text and outputs on every infinite sequence (including sequences
that are not texts for any member of C) exactly one index infinitely often.

(viii) [4] M is said to behaviourally correctly learn C with at most a anomalies
(BC^a) iff for every L ∈ C and each text T_L for L, there is a number n for
which |(W_{M(T_L[j])} − L) ∪ (L − W_{M(T_L[j])})| ≤ a whenever j > n.

(ix) [4] M is said to behaviourally correctly learn C with finitely many anomalies
(BC*) iff for every L ∈ C and each text T_L for L, there is a number n for
which |(W_{M(T_L[j])} − L) ∪ (L − W_{M(T_L[j])})| < ∞ whenever j > n.

This paper will also consider combinations of different learning criteria; for
learning criteria A_1, ..., A_n, a class C is said to be A_1 ... A_n-learnable iff there is a
learner M such that M A_i-learns C for all i ∈ {1,...,n}.

4 Approximate Learning of Functions 

Fulk and Jain [5] proposed a mathematically rigorous definition of approximate 
inference, a notion originally motivated by studies in the philosophy of science. 

Definition 4. [5] An approximate (Approx) learner outputs on the graph of a
function f a sequence of hypotheses such that there is a sequence S_0, S_1, ... of
sets satisfying the following conditions:

(a) The S_n form an ascending sequence of sets such that their union is the set
of all natural numbers;

(b) There are infinitely many n such that S_{n+1} − S_n is infinite;

(c) The n-th hypothesis is correct on all x ∈ S_n but nothing is said about the
x ∉ S_n.

The next proposition simplifies this set of conditions. 

Proposition 5. M Approx learns a recursive function f iff the following
conditions hold:

(d) For all x and almost all n, M’s n-th hypothesis is correct at x;

(e) There is an infinite set S such that for almost all n and all x ∈ S, M’s n-th
hypothesis is correct at x.

Proof. If one has (a), (b), (c), then the set S is just the first set S_n which is
infinite and the other conditions follow.

If one has (d) and (e), then one distinguishes two cases: If n is so small that the
n-th and all subsequent hypotheses are not yet correct on S then one lets S_n = ∅,
else one defines that S_n contains all x < n such that each m-th hypothesis with
m ≥ n is correct on x plus half of those members of S which are not in any S_m
with m < n. So the trick is just not to put all members of S at one step into
some S_n but just to put at each step which is applicable an infinite new amount
while still another infinite amount remains outside S_n to be put later. □

Fulk and Jain interpreted their notion of approximation as a process in scientific
inference whereby physicists take the limit of the average result of a sequence of 
experiments. Their result that the class of recursive functions is approximately 
learnable seems to justify this view. 

Theorem 6 (Fulk and Jain [5]). There is a learner M that Approx learns 
every recursive function. 

The following theorem answers an open question posed by Fulk and Jain [5] on 
whether the class of recursive functions has a learner which outputs a sequence 
of hypotheses that approximates the function to be learnt and almost always 
differs from the target only on finitely many places. 

Theorem 7. There is a learner M which learns the class of all recursive
functions such that (i) M is a BC* learner, (ii) M is a partial learner and (iii) M
is an approximate learner.

Proof. Let ψ_0, ψ_1, ... be an enumeration of all recursive functions and some
partial ones such that in every step s there is exactly one pair (e, x) for which
ψ_e(x) becomes defined at step s and this pair satisfies in addition that ψ_e(y) is
already defined by step s for all y < x. Furthermore, a function ψ_e is said to
make progress on σ at step s iff ψ_e(x) becomes defined at step s and x ∈ dom(σ)
and ψ_e(y) = σ(y) for all y < x.

Now one defines for every σ a partial-recursive function ϑ_{e,σ} as follows:

- ϑ_{e,σ}(x) = σ(x) for all x ∈ dom(σ);

- Let e_t = e;

- Inductively for all s ≥ t, if some index d < e_s makes progress on σ at step
s + 1 then let e_{s+1} = d else let e_{s+1} = e_s;

- For each value x ∉ dom(σ), if there is a step s ≥ t + x for which ψ_{e_s,s}(x) is
defined then ϑ_{e,σ}(x) takes this value for the least such step s, else ϑ_{e,σ}(x)
remains undefined.

The learner M, now to be constructed, uses these functions as hypothesis space;
on input τ, M outputs the index of ϑ_{e,σ} for the unique e and shortest prefix σ
of τ such that the following three conditions are satisfied at some time t:

- t is the first time such that t ≥ |τ| and some function makes progress on τ;

- ψ_e is that function which makes progress on τ;

- for every d < e, ψ_d did not make progress on τ at any s ∈ {|σ|,...,t} and
either ψ_{d,|σ|} is inconsistent with σ or ψ_{d,|σ|}(x) is undefined for at least one
x ∈ dom(σ).

For finitely many strings τ there might not be any such function ϑ_{e,σ}, as τ
is required to be longer than the largest value up to which some function has
made progress at time |τ|, which can be guaranteed only for almost all τ. For
these finitely many exceptions, M outputs a default hypothesis, e.g., for the
everywhere undefined function. Now the three conditions (i), (ii) and (iii) of M
are verified. For this, let ψ_d be the function to be learnt; note that ψ_d is total.

Condition (i): M is a BC* learner. Let d be the least index of the function ψ_d
to be learnt and let u be the last step where some ψ_e with e < d makes progress
on ψ_d. Then every τ ⪯ ψ_d with |τ| ≥ u + 1 satisfies that first M(τ) conjectures
a function ϑ_{e,σ} with e ≥ d and |σ| ≥ u + 1 and σ ⪯ ψ_d and second that almost
all e_s used in the definition of ϑ_{e,σ} are equal to d; thus the function computed
is a finite variant of ψ_d and M is a BC* learner.

Condition (ii): M is a partial learner. Let t_0, t_1, ... be the list of all times
where ψ_d makes progress on itself with u < t_0 < t_1 < .... Note that whenever
τ ⪯ ψ_d and |τ| = t_k for some k then the conjecture ϑ_{e,σ} made by M(τ) satisfies
e = d and |σ| = u + 1. As none of these conjectures make progress from step u + 1
onwards on ψ_d, they also do not make progress on σ after step |σ| and ϑ_{e,σ} = ψ_d;
hence the learner outputs some index for ψ_d infinitely often. Furthermore, all
other indices ϑ_{e,σ} are output only finitely often: if e < d then ψ_e makes no
progress on the target function ψ_d after step u; if e > d then the length of σ
depends on the prior progress of ψ_d on itself, and if |τ| > t_k then |σ| ≥ t_k.

Condition (iii): M is an approximate learner. Conditions (d) and (e) in
Proposition 5 are used. Now it is shown that, for all τ ⪯ ψ_d with t_k ≤ |τ| < t_{k+1}, the
hypothesis ϑ_{e,σ} issued by M(τ) is correct on the set {t_0, t_1, ...}. If |τ| = t_k then
the hypothesis is correct everywhere as shown under condition (ii). So assume
that e > d. Then |τ| > t_k and |σ| > t_k, hence ϑ_{e,σ}(x) = ψ_d(x) for all x ≤ t_k.
Furthermore, as ψ_d makes progress on σ in step t_{k+1} and as no ψ_c with c < d makes
progress on σ beyond step |σ|, it follows that the e_s defined in the algorithm of
ϑ_{e,σ} all satisfy e_s = d for s ≥ t_{k+1}; hence ϑ_{e,σ}(x) = ψ_d(x) for all x ≥ t_{k+1}. □

5 Approximate Learning of Languages 

This section proposes three notions of approximation in language learning. The 
first two notions, approximate learning and weak approximate learning, are
adaptations of the set of conditions for approximately learning recursive functions
given in Proposition 5. Recall that a set V is a finite variant of a set W iff there
is an x such that for all y > x it holds that V(y) = W(y).

Definition 8. Let S be a class of languages. S is approximately (Approx)
learnable iff there is a learner M such that for every language L ∈ S there is an infinite
set W such that for all texts T and all finite variants V of W and almost all
hypotheses H of M on T, H ∩ V = L ∩ V. S is weakly approximately (WeakApprox)
learnable iff there is a learner M such that for every language L ∈ S and for
every text T for L there is an infinite set W such that for all finite variants V
of W and almost all hypotheses H of M on T, H ∩ V = L ∩ V. S is finitely
approximately (FinApprox) learnable iff there is a learner M such that for every
language L ∈ S, all texts T for L, and any finite set D, it holds that for almost
all hypotheses H of M on T, H ∩ D = L ∩ D.

Remark 9. Jain, Martin and Stephan [13] defined a partial-recursive function
C to be an In-classifier for a class S of languages if, roughly speaking, for every
L ∈ S, every text T for L, every finite set D, and almost all n, C on T[n] will
correctly “classify” all x ∈ D as either belonging to L or not belonging to L.
A learner M that FinApprox learns a class S may be translated into a total
In-classifier for S, and vice versa.

Approximate learning requires, for each target language, the existence of a set W 
suitable for all texts, while in weakly approximate learning the set W may depend 
on T. In the weakest notion, finitely approximate learning, on any text T for a 
target language L the learner is only required to be almost always correct on any 
finite set. As will be seen later, this model is so powerful that the whole class of 
r.e. sets can be finitely approximated by a partial learner. The following results 
illustrate the models of approximate and weakly approximate learning. They 
establish that, in contrast to the function learning case, approximate language 
learnability does not imply BC* learnability. BC* learnability does not imply 
approximate learnability either, but weakly approximate learning is powerful 
enough to cover all BC* learnable classes. 

Proposition 10. If there is an infinite r.e. set W such that all members of the 
class contain W then the class is Approx learnable. 

Proof. The learner for this just conjectures range(σ) ∪ W on any input σ. □

Thus approximate learning does, for languages, not imply BC* learning.³ Note
that for infinite coinfinite r.e. sets W, the class of all r.e. supersets of W is not 
BC* learnable. The next result is the mirror image of the previous result by 
just considering a learner which conjectures the range of the data seen so far; 
for each set L in the class the infinite set S in item (e) of Proposition 5 is just 
the complement of L. 

Proposition 11. If a class C consists only of coinfinite r.e. sets then C is 
Approx learnable. 

While the class of all coinfinite r.e. sets can be approximated, this is not true
for the class of all cofinite sets. 

Proposition 12. The class of all cofinite sets is ConsWeakApproxBC*Part 
learnable but neither Approx learnable nor BC^n learnable for any n.

Proof. To make a ConsWeakApproxBC*Part learner, define P as follows. On
input σ, P determines whether or not range(σ) − range(σ') = {x} for some
x ∈ N. If range(σ) − range(σ') is either empty or equal to {#}, then P
repeats its last conjecture (P(σ')) if σ' ≠ ε; if σ' = ε, then P outputs a default
hypothesis, say a canonical index for N. If range(σ) − range(σ') = {x} for
some x ∈ N, then P determines the maximum w (if such a w exists) such that
w ∉ range(σ) ∩ {0,...,x}, and outputs a canonical index for the cofinite set
(range(σ) ∩ {0,...,w}) ∪ {z : z > w}. If no such w exists, then P outputs a
canonical index for N.

³ For example, take the class of all supersets of the set of even numbers.

Given any text T for a cofinite set L ≠ N such that w = max(N − L),
there is a sufficiently large s such that range(T[s' + 1]) ∩ {0,...,w} = L ∩
{0,...,w} for all s' > s. Furthermore, there are infinitely many n > s such
that range(T[n + 1]) − range(T[n]) = {x} for some number x > w, and on
each of these text prefixes T[n + 1], P will output a canonical index for L. P
is also consistent by construction. Thus P consistently partially learns L. On
any text T' for N, there are infinitely many stages n at which range(T'[n + 1])
contains all numbers less than x for some x, and therefore P will output a
canonical index for N infinitely often. To see that P is also a WeakApprox learner,
observe that if T'' is a text for a cofinite set L, then T'' contains an infinite
subsequence T''(n_0), T''(n_1), T''(n_2), ... of numbers such that n_0 < n_1 < n_2 <
... and T''(n_0) < T''(n_1) < T''(n_2) < ..., which means that for almost all n,
W_{P(T''[n])} contains the infinite set {T''(n_0), T''(n_1), T''(n_2), ...}. Hence P is a
WeakApprox learner. Note that P is also a BC* learner as it always outputs
cofinite sets.

Now assume for a contradiction that for some n and learner Q, Q BC^n learns
the class of all cofinite sets. Since Q BC^n learns N, there is a σ ∈ (N ∪ {#})* such
that for all τ ∈ (N ∪ {#})*, |N − W_{Q(στ)}| ≤ n. Now choose some cofinite L such
that range(σ) ⊆ L and |N − L| ≥ 2n + 1. Since Q must BC^n learn L, there exists
some θ ∈ (L ∪ {#})* such that |L Δ W_{Q(σθ)}| ≤ n. But |N − L| − |N − W_{Q(σθ)}| ≤
|L Δ W_{Q(σθ)}| ≤ n, and so by the definition of σ, |N − L| ≤ n + |N − W_{Q(σθ)}| ≤
n + n = 2n, contradicting the definition of L. Therefore the class of all cofinite
sets has no BC^n learner for any n.

Assume now that the set L to be learnt is approximated with parameter set
W. Given an approximate learner M for this class, one can construct inductively
a text T such that either the text is for some set L − {w} and M conjectures
almost always that w is in the set to be learnt or the text is for L while there
are infinitely many conjectures which do not contain W as a subset.

The idea is to construct the text T step by step by starting in (a) below and 
by alternating between (a) and (b) as needed: 

(a) Select a w ∈ L ∩ W not contained in the part of the text constructed so
far and add to the part of the text the elements of L − {w} in ascending order
until the learner M on the so far constructed initial segment conjectures a set
not containing w;

(b) Append to the so far constructed part of the text all elements of L up to
the element w (inclusively) and go back to step (a).

This then gives a text T with the desired properties: if the learner eventually
stays in (a) forever, it is wrong on the w considered when it goes into (a) for
the last time; if the learner goes to (b) infinitely often, the text T is for L while the learner
M conjectures infinitely often sets which are not supersets of W. Thus there is
no approximate learner for the class of all cofinite sets. □
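
The learner P from the first part of the proof is simple enough to simulate on finite text prefixes. The Python sketch below is an illustration under stated assumptions rather than the authors' construction verbatim: instead of a canonical index, a cofinite hypothesis is represented by its finite complement (the empty complement stands for N), the pause symbol # is represented by None, and w is read as the largest number in {0,...,x} not occurring in range(σ). Under this reading the complement of the conjectured set (range(σ) ∩ {0,...,w}) ∪ {z : z > w} is exactly the set of numbers up to x missing from range(σ). The name learner_P is hypothetical.

```python
from typing import FrozenSet, Optional, Sequence

PAUSE = None  # stands for the pause symbol '#'


def learner_P(sigma: Sequence[Optional[int]]) -> FrozenSet[int]:
    """Return the finite complement of the cofinite set conjectured on sigma."""
    seen = set()
    missing: FrozenSet[int] = frozenset()  # default hypothesis: all of N
    for symbol in sigma:
        if symbol is PAUSE or symbol in seen:
            # No new number appeared: repeat the previous conjecture.
            continue
        seen.add(symbol)
        # A new number x appeared: conjecture the cofinite set whose
        # complement consists of the numbers <= x not seen so far.
        missing = frozenset(w for w in range(symbol + 1) if w not in seen)
    return missing
```

On the prefix 0, 2, 5, 4, #, 6 of a text for N − {1, 3}, the conjectured complements are ∅, {1}, {1, 3, 4}, {1, 3}, {1, 3}, {1, 3}; every conjecture contains all data seen so far, which matches the consistency claim in the proof.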

The following result shows that weak approximate learning is quite powerful.

Theorem 13. The class of all infinite sets is ConsWeakApprox learnable.

Proof. Consider the learner M which conjectures on input σ the set

W_{M(σ)} = range(σ) ∪ {x : ∀y ∈ range(σ) [x > y]}

and consider any text T for an infinite set. Let S = {x ∈ range(T) : when
x appears first in T, no larger datum of T has been seen so far}. Note that
the set S is infinite. Now all conjectures M(T[n]) are a superset of S: if an
x ∈ S has not yet appeared in T[n] then all members of range(T[n]) are smaller
than x and x ∈ W_{M(T[n])} else x has already appeared in T[n] and is therefore
also in range(T[n]). Furthermore, if x ∉ range(T) then almost all n satisfy
max(range(T[n])) > x and therefore x ∉ W_{M(T[n])}, thus for every x almost all
hypotheses W_{M(T[n])} are correct at x. □
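
The argument can be checked on finite prefixes with a few lines of Python. This is a sketch under stated assumptions: a hypothesis is represented by its membership predicate instead of an r.e. index, the pause symbol # is represented by None, and the names conjecture and record_elements are hypothetical. The function record_elements computes the part of the set S from the proof that is visible in a finite prefix.

```python
from typing import Callable, List, Optional, Sequence

Symbol = Optional[int]  # a natural number, or None for the pause symbol '#'


def conjecture(sigma: Sequence[Symbol]) -> Callable[[int], bool]:
    """Membership test for range(sigma) united with all x above every datum seen."""
    seen = {x for x in sigma if x is not None}
    top = max(seen, default=-1)  # with no data, every natural number exceeds -1
    return lambda x: x in seen or x > top


def record_elements(sigma: Sequence[Symbol]) -> List[int]:
    """Data that were larger than everything seen before them (a prefix of S)."""
    records: List[int] = []
    top = -1
    for x in sigma:
        if x is not None and x > top:
            records.append(x)
            top = x
    return records
```

For the prefix 3, 1, #, 7, 2, 10 the record elements are 3, 7 and 10, and each conjecture made along this prefix contains all three of them, while the last conjecture correctly excludes 5.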

Unfortunately, the weakly approximate learning property of any class of infinite 
sets may be lost if finite sets are added to the target class. 

Proposition 14. Gold’s class consisting of the set of natural numbers and all 
sets {0,1,..., m} is not WeakApprox learnable. 

Proof. Make a text T where T(0) = 0 and, for each n, if the n-th hypothesis of the learner
contains T(n) + 1 then T(n + 1) = T(n) else T(n + 1) = T(n) + 1.

In the case that the text T is for a finite set with maximum m then T(n) = m
for almost all n and the n-th hypothesis contains m + 1 for almost all n; thus
the approximations are in the limit false at m + 1.

In the case that the text T is for the set of all natural numbers then consider
any m > 0 and consider the first n such that T(n + 1) = m. Then the n-th
hypothesis does not contain m. Therefore, one can conclude that for every m 
there is an n > m such that the n-th hypothesis is conjecturing m not to be 
in the set to be learnt although the set to be learnt is the set of all natural 
numbers. In particular there is no infinite set on which from some time on all 
approximations are correct. 

Thus the class considered is not weakly approximately learnable. | 
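
The text construction in the proof of Proposition 14 can be simulated for finitely many steps. The sketch below is illustrative only: a learner is modelled as a function from a prefix to a membership predicate, and the two toy learners cautious and bold are hypothetical examples, not learners from the paper.

```python
def adversarial_text(learner, steps):
    """Return T(0), ..., T(steps) built against `learner` as in the proof."""
    text = [0]                                  # T(0) = 0
    for n in range(steps):
        hypothesis = learner(text)              # n-th hypothesis, issued on T(0)...T(n)
        if hypothesis(text[n] + 1):
            text.append(text[n])                # T(n)+1 is claimed: repeat T(n)
        else:
            text.append(text[n] + 1)            # T(n)+1 is denied: present it next
    return text


def cautious(prefix):
    """Toy learner conjecturing exactly the data seen so far."""
    seen = set(prefix)
    return lambda x: x in seen


def bold(prefix):
    """Toy learner always conjecturing the set of all natural numbers."""
    return lambda x: True


print(adversarial_text(cautious, 5))
print(adversarial_text(bold, 5))
```

Against the cautious learner the text enumerates all natural numbers while each hypothesis misses the next element; against the bold learner the text stays at 0 while every hypothesis wrongly contains 1.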

It may be observed that in the proof of Theorem 13, the parameter sets S with 
respect to which the learner M approximates the class of all infinite sets may 
not necessarily be r.e. (or be of any fixed Turing degree). This motivates the 
question of whether or not the class of all infinite sets is still weakly approxi¬ 
mately learnable if one restricts the class of parameter sets in Definition 8 to 
some countable family. 

Definition 15. For any sets $L$ and $W$, where $W$ is infinite, and any text $T$ for
$L$, say that a recursive learner $M$ weakly approximately (WeakApprox) learns
$L$ via $W$ on $T$ iff for all finite variants $V$ of $W$, it holds that for almost all
hypotheses $H$ of $M$ on $T$, $H \cap V = L \cap V$. For any class $\mathcal{W}$ of infinite sets, a
class $\mathcal{S}$ of sets is weakly approximately (WeakApprox) learnable via $\mathcal{W}$ iff there
is a recursive learner $M$ such that for every $L \in \mathcal{S}$ and every text $T$ for $L$, $M$
WeakApprox learns $L$ via some $W \in \mathcal{W}$ on $T$.

Proposition 16. For any countable class $\mathcal{W}$ of infinite sets, the class of all
cofinite sets is not WeakApprox learnable via $\mathcal{W}$.


Proof. Suppose $M$ is a recursive learner that weakly approximately learns all
cofinite sets via some countable class $\mathcal{W}$ of infinite sets. First, note that there
exist $\sigma \in \mathbb{N}^*$ and $V \in \mathcal{W}$ such that for all $\tau \in \mathbb{N}^*$, $V \subseteq W_{M(\sigma\tau)}$. For, assuming
otherwise, one can build a text $T$ for $\mathbb{N}$ as follows. Let $V_0, V_1, V_2, \ldots$ be a one-one
enumeration of $\mathcal{W}$, and set $T_0 = \varepsilon$, where $T_s$ denotes the text prefix built until
stage $s$. Let $m_s$ be the minimum number not contained in $\mathrm{range}(T_s)$, and find
strings $\eta_0, \eta_1, \ldots, \eta_s$ such that for all $i \in \{0,\ldots,s\}$,
$V_i \not\subseteq W_{M(T_s \circ m_s \circ \eta_0 \circ \cdots \circ \eta_i)}$; by
assumption, such strings $\eta_0, \eta_1, \ldots, \eta_s$ must exist. Set
$T_{s+1} = T_s \circ m_s \circ \eta_0 \circ \cdots \circ \eta_s$ and let $T = \lim_s T_s$. $T$ is a text
for $\mathbb{N}$; furthermore, for any $V_i \in \mathcal{W}$, $V_i \not\subseteq W_{M(T[s+1])}$ for infinitely many $s$, so
that $M$ does not weakly approximately learn $\mathbb{N}$ via $V_i$ on $T$.

Now fix $\sigma \in \mathbb{N}^*$ and $V \in \mathcal{W}$ such that for all $\tau \in \mathbb{N}^*$, $V \subseteq W_{M(\sigma\tau)}$. As
$V$ is infinite, one can choose some $w \in V - \mathrm{range}(\sigma)$. Let $T'$ be a text for
$\mathbb{N} - \{w\}$ that extends $\sigma$. Then $M$ conjectures a set containing $w$ on almost
all text prefixes of $T'$, which shows that it cannot weakly approximately learn
$\mathbb{N} - \{w\}$. In conclusion, the class of all cofinite sets is not weakly approximately
learnable via $\mathcal{W}$. □


Theorem 17. If C is BC* learnable then C is WeakApprox learnable. 


Proof. By Theorem 13, there is a learner $M$ that weakly approximates the class
of all infinite sets. Let $O$ be a BC* learner for $\mathcal{C}$. Now the new learner $N$ is given
as follows: On input $\sigma$, $N(\sigma)$ outputs an index of the following set which first
enumerates $\mathrm{range}(\sigma)$ and then searches for some $\tau$ that satisfies the following
conditions: (1) $\mathrm{range}(\tau) = \mathrm{range}(\sigma)$; (2) $|\tau| = 2 \cdot |\mathrm{range}(\sigma)|$; (3) $W_{O(\tau \#^s)}$
enumerates at least $|\sigma|$ many elements for all $s < |\sigma|$. If all three conditions are
met then the set contains also all elements of $W_{M(\sigma)}$. If $L \in \mathcal{C}$ is finite then
for every $\tau$ of length $2 \cdot |L|$ with range $L$, the learner $O$ outputs on some input
$\tau \#^{s_\tau}$ a finite set with $c_\tau$ many elements. As there are only finitely many such $\tau$,
there is an upper bound $t$ of all $c_\tau$ and $s_\tau$. Then it follows from the construction
that the learner $N$ on any input $\sigma$ with $\mathrm{range}(\sigma) = L$ and $|\sigma| > t$ outputs a
hypothesis for the set $L$, as the corresponding $\tau$ cannot be found. Thus $N$ weakly
approximately learns $L$.

If $L \in \mathcal{C}$ is infinite then there is a locking sequence $\gamma \in L^*$ for $L$ such that
$O(\gamma\eta)$ conjectures an infinite set whenever $\eta \in L^*$. It follows for all $\sigma$ with
$\mathrm{range}(\gamma) \subseteq \mathrm{range}(\sigma)$ and $|\mathrm{range}(\sigma)| > |\gamma|$ that $N(\sigma)$ considers also a $\tau$ which is
an extension of $\gamma$ in its algorithm and which therefore meets all three conditions;
thus $N(\sigma)$ will conjecture a set consisting of the union of $\mathrm{range}(\sigma)$ and $W_{M(\sigma)}$.
As adding $\mathrm{range}(\sigma)$ to the hypothesis $W_{M(\sigma)}$ cannot make $W_{N(\sigma)}$ incorrect
at any $x$ where $W_{M(\sigma)}$ is correct, it follows that $N$ is also weakly approximately
learning $L$. Thus, by case distinction, $N$ is a weakly approximate learner for $\mathcal{C}$. □

6 Combining Partial Language Learning With Variants
of Approximate Learning 

This section is concerned with the question whether partial learners can always 
be modified to approximate the target language in the models introduced above. 

6.1 Finitely Approximate Learning 

The first results demonstrate the power of the model of finitely approximate 
learning: there is a partial learner that finitely approximates every r.e. language. 

Theorem 18. The class of all r.e. sets is FinApproxPart learnable. 

Proof. Let $M_1$ be a partial learner of all r.e. sets. Define a learner $M_2$ as follows.
Given a text $T$, let $e_n = M_1(T[n+1])$ for all $n$. On input $T[n+1]$, $M_2$ determines
the finite set $D = \mathrm{range}(T[n+1]) \cap \{0,\ldots,m\}$, where $m$ is the minimum $m \leq n$
with $e_m = e_n$. $M_2$ then outputs a canonical index for $D \cup (W_{e_n} \cap \{x : x > m\})$.

Suppose $T$ is a text for some r.e. set $L$. Then there is a least $l$ such that $M_1$
on $T$ outputs $e_l$ infinitely often and $W_{e_l} = L$. Furthermore, there is a least $l'$ such
that for all $l'' > l'$, $\mathrm{range}(T[l''+1]) \cap \{0,\ldots,l\} = L \cap \{0,\ldots,l\}$. Hence
$M_2$ will output a canonical index for $L = (L \cap \{0,\ldots,l\}) \cup (W_{e_l} \cap \{x : x > l\})$ infinitely
often. On the other hand, since, for every $h$ with $e_h \neq e_l$ and $e_h \neq e_{h'}$ for all
$h' < h$, $M_1$ outputs $e_h$ only finitely often, $M_2$ will conjecture sets of the form
$D' \cup (W_{e_h} \cap \{x : x > h\})$ only finitely often. Thus $M_2$ partially learns $L$.

To see that $M_2$ is also a finitely approximate learner, consider any number
$x$. Suppose that $M_1$ on $T$ outputs exactly one index $e$ infinitely often; further,
$W_e = L$ and $j$ is the least index such that $e_j = e$. Let $s$ be sufficiently large so that
for all $s' > s$, $\mathrm{range}(T[s'+1]) \cap \{0,\ldots,\max(\{x,j\})\} = L \cap \{0,\ldots,\max(\{x,j\})\}$.
First, assume that $M_1$ outputs only finitely many distinct indices on $T$. It follows
that $M_1$ on $T$ converges to $e$. Thus $M_2$ almost always outputs a canonical index
for $(L \cap \{0,\ldots,j\}) \cup (W_{e_j} \cap \{y : y > j\})$, and so it approximately learns $L$. Second,
assume that $M_1$ outputs infinitely many distinct indices on $T$. Let $d_1,\ldots,d_x$ be
the first $x$ conjectures of $M_1$ that are pairwise distinct and are not equal to $e$.
There is a stage $t > s$ large enough so that $e_{t'} \notin \{d_1,\ldots,d_x\}$ for all $t' > t$.
Consequently, whenever $t' > t$, $M_2$ on $T[t'+1]$ will conjecture a set $W$ such that
$W \cap \{0,\ldots,x\} = L \cap \{0,\ldots,x\}$. This establishes that $M_2$ finitely approximately
learns any r.e. set. □
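
The patching step that turns a conjecture of the partial learner into a conjecture of the new learner in the proof of Theorem 18 can be sketched as follows. This is a toy model under explicit assumptions: membership in an r.e. set is in general not decidable, so the parameter W is a hypothetical oracle mapping an index to a membership predicate, and the names m2_hypothesis and table are illustrative only.

```python
def m2_hypothesis(text_prefix, m1_indices, W):
    """Conjecture on T[n+1] given e_0, ..., e_n, returned as a membership predicate.

    text_prefix: the list T(0), ..., T(n), with "#" as pause symbol.
    m1_indices:  the list e_0, ..., e_n of conjectures of the partial learner.
    W:           assumed oracle, W(e) returns a membership predicate for the e-th set.
    """
    e_n = m1_indices[-1]
    m = m1_indices.index(e_n)          # least m <= n with e_m = e_n
    D = {x for x in text_prefix if x != "#" and x <= m}  # range(T[n+1]) ∩ {0,...,m}
    member = W(e_n)
    return lambda x: x in D or (x > m and member(x))


# Toy hypothesis space: index 0 is the even numbers, index 1 the multiples of 3.
table = {0: lambda x: x % 2 == 0, 1: lambda x: x % 3 == 0}
h = m2_hypothesis([0, 2, 4, 6], [1, 0, 1, 0], table.get)
print([x for x in range(10) if h(x)])
```

An index that first appeared late in the sequence of conjectures gets a large value of m, so its conjecture is forced to agree with the data on a long initial segment; this is the mechanism behind the finite approximation argument above.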

It may be observed in the proof of Theorem 18 that if $M_1$ is a confident partial
learner of some class $\mathcal{C}$, then $M_2$ confidently partially as well as finitely
approximately learns $\mathcal{C}$. This observation leads to the next theorem.

Theorem 19. If $\mathcal{C}$ is ConfPart learnable, then $\mathcal{C}$ is FinApproxConfPart learnable.

Gao, Jain and Stephan [7] showed that consistently partial learners exist for all
and only the subclasses of uniformly recursive families; the next theorem shows
that such learners can even be finitely approximate at the same time, in addition
to being prudent.

Theorem 20. If $\mathcal{C}$ is a uniformly recursive family, then $\mathcal{C}$ is FinApproxConsPart
learnable by a prudent learner.

Proof. Let $\mathcal{C} = \{L_0, L_1, L_2, \ldots\}$ be a uniformly recursive family. On text $T$,
define $M$ at each stage $s$ as follows:

If there are $x \in \mathbb{N}$ and $i \in \{0,1,\ldots,s\}$ such that

— $\mathrm{range}(T[s+1]) - \mathrm{range}(T[s]) = \{x\}$,

— $\mathrm{range}(T[s+1]) \subseteq L_i \cup \{\#\}$ and

— $\mathrm{range}(T[s+1]) \cap \{0,\ldots,x\} = L_i \cap \{0,\ldots,x\}$

Then $M$ outputs the least such $i$

Else $M$ outputs a canonical index for $\mathrm{range}(T[s+1]) - \{\#\}$.
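
The stage rule of the learner can be illustrated by a short Python sketch. It rests on assumptions that are not part of the paper: the uniformly recursive family is given as a finite list of membership predicates (so the search over indices is additionally cut off at the length of that list), pause symbols are written "#", and conjectures are returned as tagged tuples rather than as indices in a fixed numbering.

```python
def learner_stage(text, s, family):
    """Conjecture at stage s on the prefix T[s+1] = text[:s+1].

    family[i](y) decides whether y is in the i-th set of the family.
    Returns ("family", i) or ("finite", frozenset_of_data).
    """
    now = {v for v in text[: s + 1] if v != "#"}     # range(T[s+1]) without "#"
    before = {v for v in text[:s] if v != "#"}       # range(T[s]) without "#"
    new = now - before
    if len(new) == 1:                                # exactly one new number x
        (x,) = new
        for i in range(min(s, len(family) - 1) + 1):
            L_i = family[i]
            covers = all(L_i(y) for y in now)
            agrees = all((y in now) == L_i(y) for y in range(x + 1))
            if covers and agrees:
                return ("family", i)                 # least suitable index
    return ("finite", frozenset(now))                # the Else case


# Toy family: index 0 is the set of all natural numbers, index 1 the even numbers.
family = [lambda y: True, lambda y: y % 2 == 0]
for s in range(3):
    print(s, learner_stage([0, 2, 4], s, family))
```

On the prefix 0, 2, 4 the learner first conjectures the set of all natural numbers and then switches to the even numbers once the data disagree with index 0 below the newest element.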

The consistency of $M$ follows directly by construction. If $T$ is a text for a finite set
then the “Else-Case” will apply almost always and $M$ converges to a canonical
index for $\mathrm{range}(T)$. Now consider that $T$ is a text for some infinite set $L = L_m \in \mathcal{C}$
and $m$ is the least index of $L_m$. Let $t$ be large enough so that for all $t' > t$, all
$x \in L - \mathrm{range}(T[t+1]) - \{\#\}$ and all $j < m$, $L_j \cap \{0,\ldots,x\} \neq \mathrm{range}(T[t'+1])
\cap \{0,\ldots,x\}$. There are infinitely many stages $s > \max(\{t,m\})$ at which $T(s) \notin
\mathrm{range}(T[s]) \cup \{\#\}$ and $\mathrm{range}(T[s+1]) \cap \{0,\ldots,T(s)\} = L \cap \{0,\ldots,T(s)\}$. At
each of these stages, $M$ will conjecture $L_m$. Thus $M$ conjectures $L_m$ infinitely
often. Furthermore, for every $x$ there is some $s_x$ such that for all $y \in L -
\mathrm{range}(T[s_x+1])$, it holds that $y > x$. Thus whenever $s' > s_x$, $M$’s conjecture
on $T[s'+1]$ agrees with $L$ on $\{0,\ldots,x\}$. $M$ is therefore a finitely approximate
learner, implying that it never conjectures any incorrect index infinitely often. □

Theorem 20 and [8, Theorem 18] together give the following corollary.

Corollary 21. If $\mathcal{C}$ is ConsPart learnable, then $\mathcal{C}$ is FinApproxConsPart learnable
by a prudent learner.

The following result shows that also conservative partial learning may always be 
combined with finitely approximate learning. 

Theorem 22. If C is ConsvPart learnable, then C is FinApproxConsvPart 
learnable. 

Proof. Let $M_1$ be a ConsvPart learner for $\mathcal{C}$, and suppose that $M_1$ outputs the
sequence of conjectures $e_0, e_1, \ldots$ on some given text $T$. The construction of a
new learner $M_2$ is similar to that in Theorem 18; however, one has to ensure
that $M_2$ does not output more than one index of a set that is either equal to or a
proper superset of the target language. On input $T[s+1]$, define $M_2(T[s+1])$ as follows.

1. If $\mathrm{range}(T[s+1]) \subseteq \{\#\}$ then output a canonical index for $\emptyset$ else go to 2.

2. Let $m \leq s$ be the least number such that $e_m = e_s$. If $W_{e_s,s} \cap \{0,\ldots,m\} =
\mathrm{range}(T[s+1]) \cap \{0,\ldots,m\} = D$ then output a canonical index for $D \cup
(W_{e_m} \cap \{x : x > m\})$ else go to 3.

3. If $s \geq 1$ then output $M_2(T[s])$ else output a canonical index for $\emptyset$.

Suppose that $T$ is a text for some $L \in \mathcal{C}$. Without loss of generality, assume
that $L \neq \emptyset$; if $L = \emptyset$, then $M_2$ will always output a canonical index for $\emptyset$. $M_1$
on $T$ outputs exactly one index $e_h$ infinitely often, where $W_{e_h} = L$ and $e_{h'} \neq e_h$
for all $h' < h$. Let $s$ be the least stage at which $\mathrm{range}(T[s+1]) \cap \{0,\ldots,h\} =
L \cap \{0,\ldots,h\} = W_{e_h,s} \cap \{0,\ldots,h\}$. Then for all $s' > s$ such that $e_{s'} = e_h$, step 2
will apply, so that $M_2$ outputs a canonical index $g$ for $(L \cap \{0,\ldots,h\}) \cup (W_{e_h} \cap \{x :
x > h\}) = L$. Since there are infinitely many such $s'$, $M_2$ will output $g$ infinitely
often. Consider any other set of the form $F \cup (W_{e_l} \cap \{x : x > l\})$ that $M_2$ may
conjecture at some stage $t$, where $l \neq h$ and $e_{l'} \neq e_l$ for all $l' < l$. By construction,
$F$ is equal to $W_{e_l,t} \cap \{0,\ldots,l\}$. Thus $F \cup (W_{e_l} \cap \{x : x > l\}) \subseteq W_{e_l}$, and so by the
partial conservativeness of $M_1$, $L \not\subseteq F \cup (W_{e_l} \cap \{x : x > l\})$. If $M_2$ conjectures
some set of the form $G \cup (W_{e_h} \cap \{x : x > h\})$, where $G \neq L \cap \{0,\ldots,h\}$, then
there is some $y \in L - (G \cup (W_{e_h} \cap \{x : x > h\}))$, and so $L \not\subseteq G \cup (W_{e_h} \cap \{x :
x > h\})$. Furthermore, $L \not\subseteq \emptyset$. Therefore $M_2$ outputs exactly one index for a
set that contains $L$, and $M_2$ outputs this index infinitely often. To show that
$M_2$ outputs any incorrect index only finitely often, it is enough to show that it
finitely approximately learns $L$.

Consider any $x$. If $M_1$ on $T$ outputs only finitely many distinct indices, then
one can argue as in Theorem 18 that $M_2$ converges on $T$ to $g$. Suppose that $M_1$
on $T$ outputs infinitely many distinct indices. Let $s$ be the least stage at which
$\mathrm{range}(T[s+1]) \cap \{0,\ldots,x\} = L \cap \{0,\ldots,x\}$. Let $d_1,\ldots,d_x$ be the first $x$ pairwise
distinct indices of $M_1$ on $T$, none of which is equal to $e_h$. Then there is a least stage
$t > s$ such that $M_2(T[t+1]) = g$ and for all $t' > t$, $e_{t'} \notin \{d_1,\ldots,d_x\}$. Thus on
any $T[t'+1]$ with $t' > t$, $M_2$ either outputs $g$ or conjectures a set $W$ such that
$W \cap \{0,\ldots,x\} = L \cap \{0,\ldots,x\}$. Therefore $M_2$ is both a finitely approximate
and a conservatively partial learner of $\mathcal{C}$. □

Jain, Stephan and Ye [12] proved that for uniformly r.e. classes, class-comprising
explanatory learning is equivalent to uniform explanatory learning; the latter
means that one can construct a numbering of partial-recursive learners $M_0, M_1,
M_2, \ldots$ such that for any given r.e. numbering $H_0, H_1, H_2, \ldots$ of the target class
$\mathcal{C}$ with $W_e = \{\langle d,x \rangle : x \in H_d\}$, the $e$-th learner explanatorily learns $\mathcal{C}$ with
respect to $\{H_0, H_1, H_2, \ldots\}$. In particular, uniformly r.e. explanatorily learnable
classes are always explanatorily learnable with respect to a class-preserving
hypothesis space. The next theorem shows, however, that none of the approximate
learning criteria considered so far can be combined with class-preservingness.
Thus, in general, any successful approximation of languages must involve sets
not contained in the target hypothesis space. An intuitive explanation for this
is that a class-preserving learner may be incapable of recursively deciding, for
any given finite set $D$, whether there exists a language in the target class that
agrees with the current input on $D$.

Theorem 23. There is a uniformly r.e. class that is Approx learnable but not 
ClsPresvFinApprox learnable. 

Proof. Let $M_0, M_1, M_2, \ldots$ be an enumeration of all partial-recursive learners.
For each $e$, define a strictly increasing r.e. sequence $x_{e,1}, x_{e,2}, \ldots$ as follows.

First, for any given finite set $D$ and number $y \notin D$, let $\alpha_{D,y}$ denote the string
$1 \circ 4 \circ \cdots \circ 3y+1$, which is a concatenation (in increasing order) of all numbers of
the form $3z+1$ with $0 \leq z \leq y$ and $z \notin D$. $x_{e,1}$ is defined to be the first number
found (if such a number exists) such that for some $m_{e,1}$ with $x_{e,1} > m_{e,1}$, it
holds that $\{3e, 3x_{e,1}+1\} \subseteq W_{M_e(3e \circ \alpha_{\emptyset,m_{e,1}})}$. Suppose that $x_{e,1},\ldots,x_{e,k}$ have
been defined. $x_{e,k+1}$ is then defined to be the first number found (if such a
number exists) such that for some $m_{e,k+1}$ with $x_{e,k+1} > m_{e,k+1} > x_{e,k}$, it holds
that $\{3e, 3x_{e,k+1}+1\} \subseteq W_{M_e(3e \circ \alpha_{\{x_{e,1},\ldots,x_{e,k}\},m_{e,k+1}})}$.

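For each pair $\langle e,i \rangle$, define $L_{\langle e,i \rangle}$ according to the following case distinction.

Case (1): $x_{e,i}$ is defined for all $i$. Set $L_{\langle e,0 \rangle} = \{e\} \oplus (\mathbb{N} - \{x_{e,i} : i \in \mathbb{N}\}) \oplus \emptyset$. For
each $j > 0$, set $L_{\langle e,j \rangle} = \{e\} \oplus (\mathbb{N} - \{x_{e,i} : i < j\}) \oplus \{0\}$.

Case (2): There is a minimum $l$ such that $x_{e,l}$ is undefined. Set $L_{\langle e,0 \rangle} = \{e\} \oplus
(\{y : (l = 1 \Rightarrow y < 0) \wedge (l > 1 \Rightarrow y < x_{e,l-1})\} - \{x_{e,i} : i < l\}) \oplus \emptyset$. For
each $j$ with $1 \leq j \leq l - 1$, set $L_{\langle e,j \rangle} = \{e\} \oplus (\mathbb{N} - \{x_{e,i} : i < j\}) \oplus \{0\}$. Set
$L_{\langle e,l \rangle} = \{e\} \oplus (\mathbb{N} - \{x_{e,i} : i < l\}) \oplus \emptyset$. For each $j \geq l + 1$, set $L_{\langle e,j \rangle} = \emptyset$.

Set $\mathcal{C} = \{L_{\langle e,i \rangle} : e, i \in \mathbb{N}\}$.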

Now it is shown that $\mathcal{C}$ is approximately learnable with respect to a
class-comprising hypothesis space. On input $\sigma$, the learner $M$ outputs a canonical
index for $\emptyset$ if $\mathrm{range}(\sigma)$ does not contain any multiple of 3. Otherwise, let $e$ be
the minimum number such that $3e \in \mathrm{range}(\sigma)$; $M$ then checks whether or not
$2 \in \mathrm{range}(\sigma)$. If $2 \in \mathrm{range}(\sigma)$, $M$ searches (with computational time bounded
by $|\sigma|$) for the least $l$ (if such an $l$ exists) such that $3x_{e,l}+1 \in \mathrm{range}(\sigma)$; it then
conjectures Li^f., 1 )- If no such I exists, M conjectures If 2 ^ range{a), M 

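searches for the minimum $l'$ such that $x_{e,l'}$ has not yet been defined at stage $|\sigma|$.
If $3x_{e,l'}+1 \notin \mathrm{range}(\sigma)$, then $M$ conjectures $L_{\langle e,0 \rangle}$. If $3x_{e,l'}+1 \in \mathrm{range}(\sigma)$, then
$M$ outputs an index $d$ such that

$$W_d = \begin{cases}
\mathrm{range}(\sigma) \cup \{3z+1 : (l' = 1 \Rightarrow 0 < z < s) \wedge (l' > 1 \Rightarrow x_{e,l'-1}+1 < z < s)\} \cup L_{\langle e,0 \rangle} & \text{if } s > x_{e,l'} \text{ is the first step at which } x_{e,l'} \text{ is defined;} \\
\mathrm{range}(\sigma) \cup \{3z+1 : (l' = 1 \Rightarrow z > 0) \wedge (l' > 1 \Rightarrow z > x_{e,l'-1}+1)\} & \text{if } x_{e,l'} \text{ is undefined.}
\end{cases}$$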


For the verification that $M$ approximately learns $\mathcal{C}$, suppose that $M$ outputs
the sequence of conjectures $e_0, e_1, e_2, \ldots$ on text $T$. Assume first that $x_{e,i}$ is
defined for all $i$. If $T$ is a text for $L_{\langle e,0 \rangle}$, then for almost all $n$, $W_{e_n}$ is a finite
variant of $L_{\langle e,0 \rangle}$; furthermore, if $e_{j_0}, e_{j_1}, \ldots$ is the subsequence of conjectures for
which $W_{e_{j_i}} \neq L_{\langle e,0 \rangle}$, then the sequence $y_0, y_1, y_2, \ldots$ of minimum numbers such
that $W_{e_{j_i}}(y_i) \neq L_{\langle e,0 \rangle}(y_i)$ is almost always monotone increasing and contains a
strictly increasing subsequence. In addition, for almost all $i$, $W_{e_i}(y) = L_{\langle e,0 \rangle}(y)$
for all $y$ contained in $L_{\langle e,0 \rangle}$, which is an infinite set. Hence $M$ approximately
learns $L_{\langle e,0 \rangle}$. If $T$ is a text for $L_{\langle e,j \rangle}$ for some $j > 0$, then $2 \in \mathrm{range}(T)$ and so $M$
will eventually identify $j$ as the minimum $l$ such that $3x_{e,l}+1 \in \mathrm{range}(T)$. Thus
$M$ will converge to an index for $L_{\langle e,j \rangle}$. Next, assume that there is a minimum
$l$ such that $x_{e,l}$ is undefined. If $T$ is a text for $L_{\langle e,0 \rangle}$, then $M$ will in the limit
identify $l$ as the minimum $l'$ such that $x_{e,l'}$ is undefined; thus, as $3x_{e,l'}+1 \notin
\mathrm{range}(T)$, $M$ on $T$ will converge to an index for $L_{\langle e,0 \rangle}$. If $T$ is a text for some
nonempty $L_{\langle e,j \rangle}$ with $j \geq 1$, $M$ on $T$ will again converge to an index for $L_{\langle e,j \rangle}$:
if $2 \in L_{\langle e,j \rangle}$, then $M$ will eventually identify $j$ as the minimum number $l$ such
that $3x_{e,l}+1 \in \mathrm{range}(T)$ and converge to indices for $L_{\langle e,j \rangle}$; if $2 \notin L_{\langle e,j \rangle}$, then
$3x_{e,j}+1 \in \mathrm{range}(T)$ and the fact that $j$ is the minimum number for which $x_{e,j}$
is undefined together imply that $M$ on $T$ will converge to indices for $L_{\langle e,j \rangle}$. By
construction, $M$ converges to a canonical index for $\emptyset$ on any text with an empty
range. This completes the verification that $M$ approximately learns $\mathcal{C}$.

It remains to show that $\mathcal{C}$ is not FinApprox learnable using a class-preserving
hypothesis space. Assume that $M_e$ ClsPresvFinApprox learns $\mathcal{C}$. If there is a
minimum $l$ such that $x_{e,l}$ is undefined, then there is a text $U$ for $L_{\langle e,l \rangle}$ on which
$M_e$ almost always outputs a conjecture that is different from $L_{\langle e,l \rangle}$. Since $M_e$
finitely approximates $L_{\langle e,l \rangle}$, almost all of $M_e$’s hypotheses on $U$ must contain
$3e$. But for all $j > 0$ such that $j \neq l$, either $L_{\langle e,j \rangle} = \emptyset$ or $2 \in L_{\langle e,j \rangle}$. As
$2 \notin L_{\langle e,l \rangle}$ and $L_{\langle e,l \rangle}$ is infinite, while $L_{\langle e,0 \rangle}$ is finite, it follows that $M_e$, being a
finitely approximate learner, must almost always conjecture a set different from
any $L_{\langle e,j \rangle}$ with $j \neq l$. Hence $M_e$ is not a finitely approximate learner of $L_{\langle e,l \rangle}$.
Suppose, on the other hand, that $x_{e,i}$ is defined for all $i$. Then one can build a
text $U'$ for $L_{\langle e,0 \rangle}$ on which $M_e$ infinitely often conjectures a set containing 2; but
since $2 \notin L_{\langle e,0 \rangle}$, it follows that $M_e$ does not finitely approximately learn $L_{\langle e,0 \rangle}$.
This establishes that $\mathcal{C}$ is not ClsPresvFinApprox learnable. □

The main content of the following proposition may be summed up as follows:
the quality of the hypotheses issued by a BC* learner may be improved so that
for any given finite set $D$, the learner’s hypotheses will eventually agree with the
target language on $D$.

Proposition 24. If $\mathcal{C}$ is BC* learnable, then $\mathcal{C}$ is FinApproxBC* learnable.

Proof. Given a BC* learner $M$ of $\mathcal{C}$, one can make a new learner $N$ as follows.
On input $\sigma$, $N$ conjectures $\mathrm{range}(\sigma) \cup (W_{M(\sigma)} \cap \{z : z > |\sigma|\})$. Suppose that
$N$ is fed with a text $T$ for some $L \in \mathcal{C}$. $N$ is a BC* learner because it always
conjectures finite variants of $M$’s conjectures. Furthermore, for every finite set
$D$ there is some $s_D$ such that $s_D > \max(D)$ and $\mathrm{range}(T[s]) \cap D = L \cap D$
for all $s > s_D$. It follows by construction that for all $s > s_D$, $W_{N(T[s])} \cap D =
\mathrm{range}(T[s]) \cap D = L \cap D$, and so $N$ finitely approximately BC* learns $L$. □
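
The modification used in the proof of Proposition 24 is simple enough to sketch directly. In the sketch below the conjecture of the original learner is passed in as a membership predicate base_member, which is an assumption made for illustration (membership in an r.e. set is not decidable in general), and the function name patched_hypothesis is hypothetical.

```python
def patched_hypothesis(sigma, base_member):
    """Membership predicate for range(sigma) ∪ (W ∩ {z : z > |sigma|}),
    where W is the conjecture of the original learner on sigma."""
    content = {v for v in sigma if v != "#"}   # range(sigma) without pause symbols
    cutoff = len(sigma)                        # |sigma|
    return lambda z: z in content or (z > cutoff and base_member(z))


# Toy case: the original learner wrongly conjectures all natural numbers
# while the data so far are the odd numbers 1 and 3.
h = patched_hypothesis([1, 3, "#"], lambda z: True)
print([z for z in range(8) if h(z)])
```

Below the cut-off the new conjecture follows the data only, so any error of the original conjecture on a fixed finite set disappears once the prefix is long enough.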

The next two results consider combinations of finite approximation and some 
learning models that permit finitely many anomalies. It is readily seen that the 
additional constraint of finite approximation implies that any anomaly in the 
learner’s hypotheses will eventually be corrected. 

Proposition 25. If C is Vac*FinApprox learnable, then C is Vac learnable. 
Proposition 26. If C is Ex*FinApprox learnable, then C is Ex learnable. 

As a side remark, ConsvPartBC learning is only as powerful as ConsvEx learning;
the following proposition establishes this fact.

Proposition 27. If $\mathcal{C}$ is ConsvPartBC learnable, then $\mathcal{C}$ is PrudConsvEx learnable.

Proof. Note that on any text $T$ for some $L \in \mathcal{C}$, a ConsvPartBC learner $M$
outputs exactly one index $e$ with $W_e = L$; since $M$ is also a BC learner, this
means that $M$ on $T$ converges to $e$ and it never outputs a proper superset of $L$.
By [8, Theorem 29] and [7, Theorem 10], $\mathcal{C}$ is PrudConsvEx learnable. □


6.2 Weakly Approximate, Approximate and BC* Learning 

The next proposition shows that Theorem 20 cannot be improved and gives a 
negative answer to the question whether partial or consistent partial learning 
can be combined with weakly approximate learning. 

Proposition 28. The uniformly recursive class {A : A = N or A contains all 
even and finitely many odd numbers or A contains finitely many even and all 
odd numbers} is (a) ConsWeakApprox learnable and (b) ConsPart learnable, 
but not WeakApproxPart learnable. 

Proof. That (a) can be satisfied follows from Theorem 13; that (b) can be
satisfied follows from [8, Theorem 18]. Furthermore, one can easily make a text
$T$ which makes sure that a given partial learner $M$ for the class does not also
weakly approximate it. The text $T$ is defined inductively by going through the
following loop:

1. Let $n = 0$;

2. As long as $M(T[n])$ does not conjecture a set which contains all even numbers
and only finitely many odd numbers let $T(n)$ be the least even number not
yet in the text and update $n = n + 1$;

3. As long as $M(T[n])$ does not conjecture a set which contains all odd numbers
and only finitely many even numbers let $T(n)$ be the least odd number not
yet in the text and update $n = n + 1$;

4. Go to Step 2.
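
The loop above can be simulated for a bounded number of steps. The following sketch is illustrative only: whether a conjecture contains all even and only finitely many odd numbers is not decidable in general, so the learner is modelled by a hypothetical function learner_kind that returns the label "E" in that case, "O" in the symmetric case and any other value otherwise. The toy learner majority is an assumption made for the example.

```python
def alternating_text(learner_kind, steps):
    """Build `steps` elements of the text T and count how often the loop changes step."""
    text, want, switches = [], "E", 0           # Step 1: n = 0; begin in Step 2
    next_num = {"E": 0, "O": 1}                 # least even / odd number not yet in T
    while len(text) < steps:
        if learner_kind(text) == want:          # exit condition of the current step
            want = "O" if want == "E" else "E"
            switches += 1
        else:                                   # stay in the step and extend the text
            text.append(next_num[want])
            next_num[want] += 2
    return text, switches


def majority(prefix):
    """Toy learner: follows the parity that is in the majority so far."""
    evens = sum(1 for v in prefix if v % 2 == 0)
    odds = len(prefix) - evens
    return "E" if evens > odds else "O" if odds > evens else "N"


print(alternating_text(majority, 8))
```

Each change of step corresponds to a conjecture of one of the two kinds, so a learner that keeps leaving both steps alternates between the two kinds of conjectures while the text enumerates all natural numbers.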

It is easy to see that as the learner is partial it cannot get stuck in Step 2 or Step
3 forever, as it would not output an index for $\mathrm{range}(T)$ infinitely often in that
case. Hence it alternates between Steps 2 and 3 infinitely often and will therefore
alternate between conjecturing sets containing all even and only finitely many odd
numbers and sets containing all odd and only finitely many even numbers. Hence there
is no infinite set which is contained in almost all hypotheses; however, the range of
$T$ is the set of natural numbers and thus the learner is not weakly approximating it. □

The next theorem shows that neither partial learning nor consistent partial learning
can be combined with approximate learning. In fact, it establishes a stronger
result: consistent partial learnability and approximate learnability are insufficient
to guarantee both partial and weakly approximate learnability simultaneously.

Theorem 29. There is a class of r.e. sets with the following properties:

(i) The class is not BC* learnable;

(ii) The class is not WeakApproxPart learnable;

(iii) The class is Approx learnable;

(iv) The class is Ex[K'] learnable;

(v) The class is ConsPart learnable.

Proof. The key idea is to diagonalise against a list $M_0, M_1, \ldots$ of learners which
are all total and which contains for every learner to be considered a delayed
version. This permits one to ignore the case that some learner is undefined on some
input.

The class witnessing the claim consists of all sets $L_d$ such that for each $d$,
either $L_d$ is $\{d, d+1, \ldots\}$ or $L_d$ is a subset built by the following diagonalisation
procedure: One assigns to each number $x \geq d$ a level $\ell(x)$.

— If some set $L_{d,e} = \{x \geq d : \ell(x) \leq e\}$ is infinite then let $L_d = L_{d,e}$ for the
least such $e$ and $M_d$ does not partially learn $L_d$

— else let $L_d = \{d, d+1, \ldots\}$ and $M_d$ does not weakly approximate $L_d$.

The construction of the sets is inductive over stages. For each stage $s = 0, 1, 2, \ldots$:

— Let $\tau_e$ be a sequence of all $x \in \{d, d+1, \ldots, d+s-1\}$ with $\ell(x) = e$ in
ascending order;

— If there is an $e < s$ such that $e$ has not been cancelled in any previous
step and for each $\eta \preceq \tau_e$ the intersection $W_{M_d(\tau_0 \tau_1 \ldots \tau_{e-1} \eta),s} \cap \{y : d \leq y <
d+s \wedge \ell(y) > e\}$ contains at least $|\tau_e|$ elements

• Then choose the least such $e$ and let $\ell(d+s) = e$ and cancel all $e'$ with
$e < e' < s$

• Else let $\ell(d+s) = s$.

A text $T = \lim_e \sigma_e$ is defined as follows (where $\sigma_0$ is the empty sequence):

— Let $\tau_e$ be the sequence of all $x$ with $\ell(x) = e$ in ascending order;

— If $\sigma_e$ is finite then let $\sigma_{e+1} = \sigma_e \tau_e$ else let $\sigma_{e+1} = \sigma_e$.

In case some $\sigma_e$ are infinite, let $e$ be smallest such that $\sigma_e$ is infinite. Then $T = \sigma_e$
and $L_d = L_{d,e}$ and $T$ is a text for $L_d$. As $\sigma_e$ is infinite, one can conclude that

$$\forall \eta \preceq \sigma_e\ \forall c\ [\,|W_{M_d(\tau_0 \tau_1 \ldots \tau_{e-1} \eta)} \cap \{y : \ell(y) > e\}| > c\,]$$

and thus $M_d$ outputs on $T$ almost always a set containing infinitely many elements
outside $L_d$; so $M_d$ does neither partially learn $L_d$ nor BC* learn $L_d$.

In case all $\sigma_e$ are finite and therefore all $L_{d,e}$ are finite there must be infinitely
many $e$ that never get cancelled. Each such $e$ satisfies

$$\exists \eta \preceq \tau_e\ [\,W_{M_d(\tau_0 \tau_1 \ldots \tau_{e-1} \eta)} \cap \{y : \ell(y) > e\} \text{ is finite}\,]$$

and therefore $e$ also satisfies $\exists \eta \preceq \tau_e\ [W_{M_d(\tau_0 \tau_1 \ldots \tau_{e-1} \eta)}$ is finite$]$. Thus $M_d$ outputs
on the text $T$ for the cofinite set $L_d = \{d, d+1, \ldots\}$ infinitely often a finite set
and $M_d$ is neither weakly approximately learning $L_d$ (as there is no infinite set
on which almost all conjectures are correct) nor BC* learning $L_d$. Thus claims
(i) and (ii) are true.

Next it is shown that the class of all $L_d$ is approximately learnable by some
learner $N$. This learner $N$ will on a text for $L_d$ eventually find the minimum
$d$ needed to compute the function $\ell$. Once $N$ has found this $d$, $N$ will on each
input $\sigma$ conjecture the set

$$W_{N(\sigma)} = \{x : x > \max(\mathrm{range}(\sigma)) \vee \exists y \in \mathrm{range}(\sigma)\ [\ell(x) \leq \ell(y)]\}$$

In case $L_d = L_{d,e}$ for some $e$, $L_{d,e}$ is infinite, and for each text for $L_{d,e}$, almost all
prefixes $\sigma$ of this text satisfy $\max\{\ell(y) : y \in \mathrm{range}(\sigma)\} = e$ and $L_{d,e} \subseteq W_{N(\sigma)}$. So
almost all conjectures are correct on the infinite set $L_d$ itself. Furthermore, $W_{N(\sigma)}$
does not contain any $x < \max(\mathrm{range}(\sigma))$ with $\ell(x) > e$, hence $N$ eventually
becomes correct also on any $x \notin L_{d,e}$ and therefore $N$ approximates $L_{d,e} = L_d$.

In case $L_d = \{d, d+1, \ldots\}$, all $L_{d,e}$ are finite. Then consider the infinite
set $S = \{x : \forall y > x\ [\ell(y) > \ell(x)]\}$. Let $x \in S$ and consider any $\sigma$
with $\min(\mathrm{range}(\sigma)) = d$. If $x \geq \max(\mathrm{range}(\sigma))$ then $x \in W_{N(\sigma)}$. If $x <
\max(\mathrm{range}(\sigma))$ then $\ell(\max(\mathrm{range}(\sigma))) > \ell(x)$ and again $x \in W_{N(\sigma)}$. Thus
$W_{N(\sigma)}$ contains $S$. Furthermore, for all $x \geq d$ and sufficiently long prefixes $\sigma$
of the text, $\ell(\max(\mathrm{range}(\sigma))) > \ell(x)$ and therefore $x \in W_{N(\sigma)}$ for almost all
prefixes $\sigma$ of the text. So again $N$ approximates $L_d$. Thus claim (iii) is true.

Furthermore, there is a $K'$-recursive learner $O$ which explanatorily learns the
class. On input $\sigma$ with at least one element in $\mathrm{range}(\sigma)$, the learner determines
$d = \min(\mathrm{range}(\sigma))$. If there is now some $e < |\sigma|$ such that $L_{d,e}$ is infinite then $O$
conjectures $L_{d,e}$ for the least such $e$ else $O$ conjectures $\{d, d+1, \ldots\}$. It is easy
to see that these hypotheses converge to the set $L_d$ to be learnt: eventually the
minimum of the range of each input is $d$. In the case that $L_d = L_{d,e}$ for some $e$
this $e$ is detected whenever the input is longer than $e$ and therefore the learner
converges to $L_{d,e}$. In the case that all $L_{d,e}$ are finite, the learner almost always
outputs the same hypothesis for $\{d, d+1, \ldots\}$. Thus $O$ is an Ex[K'] learner and
condition (iv) is true.

It remains to show that the class is ConsPart learnable. This follows from
the fact that the class is a subclass of the uniformly recursive family $\mathcal{U} =
\{L_{d,e} : d, e \in \mathbb{N}\} \cup \{\{d + x : x \in \mathbb{N}\} : d \in \mathbb{N}\}$. To see that $\mathcal{U}$ is uniformly
recursive, it may be observed from the construction of $L_{d,e}$ that for each $d$, $\ell(x)$ is
defined for all $x \geq d$; each of these values, moreover, can be calculated
effectively. Thus one can uniformly decide for all $d$, $e$ and $y$ whether or not $y \geq d$ and
$\ell(y) \leq e$, that is, whether or not $y \in L_{d,e}$. Consequently, by [8, Theorem 18], the
given class is consistently partially learnable, as required. □

The next result separates ConsApproxPart learning from BC* learning. 

Proposition 30. The class $\mathcal{C} = \{\mathbb{N}\} \cup \{\{0,\ldots,e\} \cup \{2x : 2x > e\} : e \in \mathbb{N}\}$ is
ConsApproxPart learnable but not BC* learnable.

Proof. Make a learner $M$ as follows. On input $\sigma$, if $\mathrm{range}(\sigma) - \mathrm{range}(\sigma') = \{x\}$
for some odd number $x$, where $\sigma'$ is $\sigma$ without its last element, then $M$ outputs
a canonical index for $\mathbb{N}$. Otherwise,
$M$ determines the maximum odd number $d$ (if such a $d$ exists) such that $d \in
\mathrm{range}(\sigma)$, and outputs a canonical index for $\{y : y \leq d\} \cup \{2z : 2z > d\}$.
If no such $d$ exists, then $M$ outputs a canonical index for the set of all even
numbers. Note that $M$ is consistent by construction. If $M$ is fed with a text $T$
for some set $L = \{0,\ldots,e\} \cup \{2x : 2x > e\}$, then there is a least $s$ such that
$\{0,\ldots,e\} \subseteq \mathrm{range}(T[s])$. Thus for all $s' > s$, $M$ will output a canonical index
for $L$ and so it explanatorily learns $L$. If $M$ is fed with a text for $\mathbb{N}$, then it will
output a canonical index for $\mathbb{N}$ at all stages where a new odd number appears;
that is, it will output a canonical index for $\mathbb{N}$ infinitely often. Furthermore, since
$M$’s conjecture at every stage contains the set of all even numbers, and $\{0,\ldots,l\}$
is contained in almost all of $M$’s conjectures for every $l$, $M$ is an approximate
learner, which implies that it never outputs any incorrect index infinitely often.
Hence $M$ ConsApproxPart learns $\mathcal{C}$.

To see that $\mathcal{C}$ is not BC* learnable, note that if some learner $N$ BC* learns
$\mathbb{N}$, then there is a $\sigma \in (\mathbb{N} \cup \{\#\})^*$ such that for all $\tau \in (\mathbb{N} \cup \{\#\})^*$, $W_{N(\sigma\tau)}$
is cofinite: otherwise, one can build a text $T'$ for $\mathbb{N}$ such that $N$ on $T'$ outputs
a coinfinite set infinitely often, contradicting the fact that $N$ BC* learns $\mathbb{N}$. If
$\mathrm{range}(\sigma) = \emptyset$, let $d = 0$; otherwise, let $d = \max(\mathrm{range}(\sigma))$. Then one can extend
$\sigma$ to a text $\sigma \circ T''$ for $L' = \{0,\ldots,d\} \cup \{2z : 2z > d\}$. By the choice of $\sigma$, $N$
on $\sigma \circ T''$ almost always outputs a cofinite set, and so it does not even partially
learn $L'$. Therefore $\mathcal{C}$ is not BC* learnable. □
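
The learner $M$ from the first part of the proof of Proposition 30 can be sketched in a few lines. The representation is an assumption made for illustration: conjectures are returned as descriptive tuples instead of canonical indices, "#" denotes the pause symbol, and the function name conjecture is hypothetical.

```python
def conjecture(sigma):
    """Describe the conjecture on sigma.

    ("N",)                  the set of all natural numbers
    ("initial+evens", d)    the set {y : y <= d} ∪ {2z : 2z > d}
    ("evens",)              the set of all even numbers
    """
    last = sigma[-1] if sigma else "#"
    if last != "#" and last % 2 == 1 and last not in sigma[:-1]:
        return ("N",)                       # a new odd number has just appeared
    odds = [v for v in sigma if v != "#" and v % 2 == 1]
    if odds:
        return ("initial+evens", max(odds))
    return ("evens",)


# A hypothetical text prefix for {0,...,4} ∪ {2x : 2x > 4}.
text = [0, 3, 1, 2, 4, 6]
for n in range(1, len(text) + 1):
    print(text[:n], conjecture(text[:n]))
```

On this prefix the learner outputs the index for the set of all natural numbers exactly at the two stages where a new odd number appears and afterwards stays with the set determined by the largest odd number seen.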

Remark 31. Note that ApproxBC*Part learning cannot in general be combined
with consistency; for example, consider the class $\{K\}$, which is finitely
learnable but cannot be consistently learnt because $K$ is not recursive
[8, Theorem 18].

While the preceding negative results suggest that approximate and weakly
approximate learning impose constraints that are too stringent for combining with
partial learning, at least partly positive results can be obtained. For example,
the following theorem shows that ConsvPart learnable classes are ApproxPart
learnable (thus dropping only the conservativeness constraint) by BC* learners.
This considerably improves an earlier result by Gao, Stephan and Zilles [8] which
states that every ConsvPart learnable class is also BC* learnable.

Theorem 32. If $\mathcal{C}$ is ConsvPart learnable then $\mathcal{C}$ is ApproxPart learnable by a
BC* learner.

Proof. Let $M$ be a ConsvPart learner for $\mathcal{C}$. For a text $T$ for a language $L \in
\mathcal{C}$, one considers the sequence $e_0, e_1, \ldots$ of distinct hypotheses issued by $M$; it
contains one correct hypothesis while all others are not indices of supersets of
$L$. For each hypothesis $e_n$ one has two numbers tracking its quality: $b_{n,t}$ is the
maximal $s < n + t$ such that all $T(u)$ with $u < s$ are in $W_{e_n,n+t} \cup \{\#\}$ and
$a_{n,t} = 1 + \max\{b_{m,t} : m < n\}$.

Now one defines the hypothesis set $H_{e_n,\sigma}$ for any sequence $\sigma$. Let $e_{n,0}, e_{n,1}, \ldots$
be a sequence with $e_{n,0} = e_n$ and $e_{n,u}$ be the $e_m$ for the minimum $m$ such that
$m = n$ or $W_{e_m}$ has enumerated all members of $\mathrm{range}(\sigma)$ within $u + t$ time steps.
The set $H_{e_n,\sigma}$ contains all $x$ for which there is a $u > x$ with $x \in W_{e_{n,u}}$.

An intermediate learner $O$ now conjectures some canonical index of a set
$H_{e_n,\sigma}$ at least $k$ times iff there is a $t$ with $\sigma = T(0)T(1) \ldots T(a_{n,t})$ and $b_{n,t} > k$.
Thus $O$ conjectures $H_{e_n,\sigma}$ infinitely often iff $W_{e_n}$ contains $\mathrm{range}(T)$ and $a_{n,t} =
|\sigma|$ for almost all $t$.

If $e_n$ is the correct index for the set to be learnt then, by conservativeness,
the sets $W_{e_m}$ with $m < n$ are not supersets of the target set. So the values $b_{m,t}$
converge, which implies that $a_{n,t}$ converges to some $s$. It follows that for the
prefix $\sigma$ of $T$ of length $s$, the canonical index of $H_{e_n,\sigma}$ is conjectured infinitely
often while no other index is conjectured infinitely often. Thus $O$ is a partial
learner. Furthermore, for all sets $H_{e_m,\tau}$ conjectured after $a_{n,t}$ has reached its
final value $s$, it holds that the $e_{m,u}$ in the construction of $H_{e_m,\tau}$ converge to $e_n$.
Thus $H_{e_m,\tau}$ is the union of $W_{e_n}$ and a finite set. Hence $O$ is a BC* learner. To
guarantee the third condition on approximate learning, $O$ will be translated into
another learner $N$.

Let $d_0, d_1, \ldots$ be the sequence of O output on the text T. Now N will
copy this sequence but with some delay. Assume that $N(\sigma_k) = d_k$ and
$\sigma_k$ is a prefix of T. Then N will keep the hypothesis $d_k$ until the
current prefix $\sigma_{k+1}$ considered satisfies either
$\mathrm{range}(\sigma_{k+1}) \supsetneq \mathrm{range}(\sigma_k)$ or
$W_{d_k,|\sigma_{k+1}|} \ne \mathrm{range}(\sigma_{k+1})$.

If $\mathrm{range}(T)$ is infinite, the sequence of hypotheses of N will be the
same as that of O, only with some additional delay. Furthermore, almost all
$W_{d_n}$ contain $\mathrm{range}(T)$, thus the resulting learner N learns
$\mathrm{range}(T)$ and is almost always correct on the infinite set
$\mathrm{range}(T)$; in addition, N learns $\mathrm{range}(T)$ partially and is
also BC*. If $\mathrm{range}(T)$ is finite, there will be some correct index
that equals infinitely many $d_n$. There is a step t by which all elements of
$\mathrm{range}(T)$ have been seen in the text and enumerated into $W_{d_n}$.
Therefore, when the learner conjectures this correct index again, it will never
withdraw it; furthermore, it will eventually replace every incorrect conjecture
due to the comparison of the two sets. Thus the learner converges explanatorily
to $\mathrm{range}(T)$ and is also in this case learning $\mathrm{range}(T)$ in
a BC* way, partially and approximately. From the proof of Theorem 18, one can
see that N may be translated into a learner satisfying all the requirements of
ApproxPart and BC* learning. □

Example 33. The class
$\{\{e+d : d \in \mathbb{N}\} : e \in \mathbb{N}\} \cup \{\{e+d : e \in K - K_d\} : e \in \mathbb{N}\}$
is Ex learnable and hence ApproxBC*Part learnable, but it is not ConsvPart
learnable [6, Theorem 29].


Case and Smith [4] published Harrington’s observation that the class of
recursive functions is BC* learnable. This result does not carry over to the
class of r.e. sets; for example, Gold’s class consisting of the set of natural
numbers and all finite sets is not BC* learnable. In light of Theorem 7, which
established that the class of recursive functions can be BC* and Part learnt
simultaneously, it is interesting to know whether any BC* learnable class of
r.e. sets can be both BC* and Part learnt at the same time. While this question
in its general form remains open, the next result shows that BC^n learning is
indeed combinable with partial learning.

Theorem 34. Let $n \in \mathbb{N}$. If C is BC^n learnable, then C is Part
learnable by a BC^n learner.

Proof. Fix any n such that C is BC^n learnable. Given a recursive BC^n learner
M of C, one can construct a new learner $N_1$ as follows. First, let
$F_0, F_1, F_2, \ldots$ be a one-one enumeration of all finite sets such that
$|F_i| \le n$ for all i. Fix a text T, and let $e_0, e_1, e_2, \ldots$ be the
sequence of M’s conjectures on T.

For each set of the form $W_{e_i} \cup F_j$ (respectively $W_{e_i} - F_j$),
$N_1$ outputs a canonical index for $W_{e_i} \cup F_j$ (respectively
$W_{e_i} - F_j$) at least m times iff the following two conditions hold.

1. There is a stage $s > j$ for which the number of distinct $x < j$ such that
either $x \in W_{e_i,s} \wedge x \notin \mathrm{range}(T[s+1])$ or
$x \in \mathrm{range}(T[s+1]) \wedge x \notin W_{e_i,s}$ holds does not
exceed n.

2. There is a stage $t > m$ such that for all $x < m$,
$x \in W_{e_i,t} \cup F_j$ iff $x \in \mathrm{range}(T[t+1])$ (respectively
$x \in W_{e_i,t} - F_j$ iff $x \in \mathrm{range}(T[t+1])$).

At any stage $T[s+1]$ where no set of the form $W_{e_i} \cup F_j$ or
$W_{e_i} - F_j$ satisfies the conditions above, or each such set has already
been output the required number of times (up to the present stage), $N_1$
outputs $M(T[s+1])$.
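The two conditions can be checked mechanically once the existential search over stages is bounded. The Python sketch below is a hedged illustration, not the construction itself: `toy_enum` is a hypothetical stand-in for the approximations $W_{e,s}$, `max_stage` is an assumed search bound (a learner at a finite stage can only inspect finitely many stages), and $T[k]$ is taken to be the first k data of the text with "#" as the pause symbol.

```python
from typing import Callable, FrozenSet, List, Set, Union

Datum = Union[int, str]
Enum = Callable[[int, int], Set[int]]


def toy_enum(e: int, s: int) -> Set[int]:
    """Hypothetical enumerator (assumption): every index stands for the
    even numbers, and W_{e,s} consists of the members below s."""
    return {x for x in range(s) if x % 2 == 0}


def content(text: List[Datum], k: int) -> Set[int]:
    """range(T[k]): the numbers among the first k data, pauses dropped."""
    return {x for x in text[:k] if x != "#"}


def condition1(e: int, j: int, n: int, text: List[Datum],
               enum: Enum, max_stage: int) -> bool:
    """Some stage s with j < s <= max_stage at which W_{e,s} and
    range(T[s+1]) disagree on at most n numbers x < j."""
    for s in range(j + 1, max_stage + 1):
        w, r = enum(e, s), content(text, s + 1)
        if sum(1 for x in range(j) if (x in w) != (x in r)) <= n:
            return True
    return False


def condition2(e: int, patch: FrozenSet[int], m: int, text: List[Datum],
               enum: Enum, max_stage: int, union: bool = True) -> bool:
    """Some stage t with m < t <= max_stage at which the patched set
    (W_{e,t} united with, or minus, the finite set) agrees with
    range(T[t+1]) on all x < m."""
    for t in range(m + 1, max_stage + 1):
        w = enum(e, t)
        candidate = (w | patch) if union else (w - patch)
        r = content(text, t + 1)
        if all((x in candidate) == (x in r) for x in range(m)):
            return True
    return False


# Usage: a text for (evens plus the number 1); the hypothesis "evens" has one
# error, which the finite patch {1} repairs when it is added as a union.
text = [1] + list(range(0, 40, 2))
print(condition1(0, 5, 1, text, toy_enum, 10),
      condition2(0, frozenset({1}), 4, text, toy_enum, 10, union=True))
```

With n = 0 the first check rejects the hypothesis in this toy run, and subtracting the patch instead of adding it fails the second check, which reflects how the two conditions filter out patched sets that differ from the target.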

Suppose T is a text for some $L \in C$. Since M is a BC^n learner of C, it
holds that for almost all i, there are at most n x’s such that
$W_{e_i}(x) \ne L(x)$. Furthermore, for all j such that $W_{e_j}(x) \ne L(x)$
for at least $n + 1$ distinct x’s, there is an l such that for all $l' > l$,
neither $W_{e_j} \cup F_{l'}$ nor $W_{e_j} - F_{l'}$ will satisfy Condition 1;
thus, for any set S such that $S(x) \ne L(x)$ for more than n distinct values
of x, $N_1$ will conjecture S only finitely often. On the other hand, if there
are at most n distinct x’s such that $W_{e_j}(x) \ne L(x)$, then there is some
l such that either $L = W_{e_j} \cup F_l$ or $L = W_{e_j} - F_l$; consequently,
either $W_{e_j} \cup F_l$ or $W_{e_j} - F_l$ will satisfy Conditions 1 and 2
for infinitely many m. Hence $N_1$ is a BC^n learner of L and it outputs at
least one correct index for L infinitely often on any text for L. Using a
padding technique, one can define a further learner N that BC^nPart learns C. □

Theorems 35 and 38 show that partial BC* learning is possible for classes that 
can be BC* learned by learners that satisfy some additional constraints. 

Theorem 35. Assume that C is BC* learnable by a learner that outputs on each
text for any $L \in C$ at least once a fully correct hypothesis. Then C is Part
learnable by a BC* learner.

Proof. Let M be given and on a text T, let $e_0, e_1, \ldots$ be the sequence
of hypotheses by M. Now one can make a learner O which on input
$T(0)T(1)\ldots T(n)$ first computes $e_0, e_1, \ldots, e_n$ and then computes
for every $e_m$ the quality $q_{m,n}$, which is the maximal number $y \le n$
such that for all $x < y$ the number x has been enumerated into $W_{e_m,n}$ iff
$x \in \{T(0), T(1), \ldots, T(n)\}$. In each step the learner O outputs the
hypothesis $e_m$ for the least m such that either (a) $e_m$ has been output so
far less than $q_{m,n}$ times or (b) all $k \le n$ satisfy that $e_k$ has been
output $q_{k,n}$ times and $q_{k,n} \le q_{m,n}$. One can see that false
hypotheses get output only finitely often while at least one correct hypothesis
gets output infinitely often; as all but finitely many hypotheses of M are
finite variants of L, the same is true for the modified learner O. By applying
a padding technique, O can be converted to a learner N which is at the same
time a BC* learner and a partial learner. □
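The quota-based selection in this proof can be illustrated as follows. The Python sketch is one possible reading and makes several assumptions that are not in the paper: `toy_enum` is a hypothetical stand-in for the approximations $W_{e,n}$, the bounds on y and k are read as non-strict, "has been output $q_{k,n}$ times" is read as "at least that many times", and `counts` is an assumed bookkeeping dictionary recording how often each index has been output so far.

```python
from typing import Callable, Dict, List, Set, Union

Datum = Union[int, str]
Enum = Callable[[int, int], Set[int]]


def toy_enum(e: int, s: int) -> Set[int]:
    """Hypothetical enumerator (assumption): index 0 stands for the even
    numbers, any other index for all naturals; W_{e,s} = members below s."""
    return {x for x in range(s) if e != 0 or x % 2 == 0}


def quality(e: int, n: int, text: List[Datum], enum: Enum) -> int:
    """Largest y <= n such that every x < y is in W_{e,n} exactly when it
    occurs among T(0), ..., T(n)."""
    w = enum(e, n)
    seen = {x for x in text[:n + 1] if x != "#"}
    y = 0
    while y < n and ((y in w) == (y in seen)):
        y += 1
    return y


def next_output(hyps: List[int], n: int, text: List[Datum], enum: Enum,
                counts: Dict[int, int]) -> int:
    """Return e_m for the least m satisfying rule (a) or rule (b)."""
    q = [quality(hyps[m], n, text, enum) for m in range(n + 1)]
    for m in range(n + 1):
        if counts.get(hyps[m], 0) < q[m]:   # rule (a): quota not used up
            return hyps[m]
    # rule (b): every quota is used up, so take the least m of maximal quality
    return hyps[q.index(max(q))]


# Usage: M first guesses "all naturals" (index 1), then "evens" (index 0).
text = [0, 2, 4, 6, 8]
hyps = [1, 0, 0, 0, 0]
print(next_output(hyps, 4, text, toy_enum, {}),
      next_output(hyps, 4, text, toy_enum, {1: 1}),
      next_output(hyps, 4, text, toy_enum, {1: 1, 0: 4}))
```

The design point the sketch tries to convey is that a hypothesis whose quality stays bounded receives only a bounded quota of outputs, whereas a correct hypothesis has unbounded quality and is therefore selected again and again.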

The next definition gives an alternative way of tightening the constraint of BC* 
learning. 

Definition 36. Let C be a class of r.e. sets. A recursive learner M is said to
Vac* learn C iff M outputs on any text T for every $L \in C$ only finitely many
indices, and for almost all n, $W_{M(T[n+1])}$ is a finite variant of L.

Example 37. Case and Smith [4] showed that Vac* and Ex* learning of recursive
functions are equivalent. However, this equivalence does not extend to all
classes of r.e. sets. Take, for example, the class
$C = \{\{e\} \oplus \mathbb{N} : e \in \mathbb{N}\} \cup \{\{e\} \oplus \{x : x < |W_e|\} : e \in \mathbb{N}\}$.
C is Vac learnable: on any input $\sigma$ whose range is of the form
$\{e\} \oplus D$, determine whether $\max(D) > |W_{e,|\sigma|}|$; if so,
conjecture $\{e\} \oplus \mathbb{N}$; otherwise, conjecture
$\{e\} \oplus \{x : x < |W_e|\}$. If $\mathrm{range}(\sigma)$ does not contain
any even number, conjecture $\mathrm{range}(\sigma)$.
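The learner described in this example is simple enough to sketch directly. The Python code below is an illustration under assumptions not fixed by the text: the join is coded in the usual way, with e appearing as the even number 2e and each member x of D as the odd number 2x + 1; `toy_enum` is a hypothetical stand-in for the approximations $W_{e,s}$; conjectures are returned as descriptive tuples rather than as r.e. indices; and an empty D is treated as not exceeding the bound.

```python
from typing import Callable, List, Set, Tuple, Union

Datum = Union[int, str]
Enum = Callable[[int, int], Set[int]]


def toy_enum(e: int, s: int) -> Set[int]:
    """Hypothetical enumerator (assumption): every W_e equals {0, 1}, with
    one new element appearing per stage."""
    return set(range(min(s, 2)))


def example37_learner(sigma: List[Datum], enum: Enum) -> Tuple:
    data = [x for x in sigma if x != "#"]
    evens = [x for x in data if x % 2 == 0]
    if not evens:
        # no even number seen yet: conjecture the finite range itself
        return ("finite", frozenset(data))
    e = evens[0] // 2                      # decode e from the even datum 2e
    d = [(x - 1) // 2 for x in data if x % 2 == 1]
    bound = len(enum(e, len(sigma)))       # |W_{e,|sigma|}|
    if d and max(d) > bound:
        return ("full", e)                 # stands for {e} joined with N
    return ("bounded", e)                  # stands for {e} joined with {x : x < |W_e|}


# Usage: with e = 3, the data 1 and 3 decode to D = {0, 1}, which stays within
# the bound, whereas the datum 11 decodes to 5 and forces the "full" conjecture.
print(example37_learner([6, 1, 3], toy_enum),
      example37_learner([6, 1, 11], toy_enum))
```

Because the bound $|W_{e,|\sigma|}|$ changes only finitely often when $W_e$ is finite, the learner switches between at most finitely many descriptions on any text for a set in the class, which is the behaviour vacillatory learning permits.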

On the other hand, C is not Ex* learnable. Assume by way of a contradiction
that a recursive learner M Ex* learns C. Using K as an oracle, one can
determine for any e whether $W_e$ is finite. By the assumption that M is an Ex*
learner, one can enumerate a text T for
$L_e = \{e\} \oplus \{x : x < |W_e|\}$ until at least one of the following
holds.

1. There is some m such that for all $x > m$, $x \notin W_e$. This immediately
implies that $W_e$ is finite.

2. For some $\sigma \in (L_e \cup \{\#\})^*$ such that $\sigma$ is a prefix of
T, it holds that for all $\eta \in (L_e \cup \{\#\})^*$,
$M(\sigma\eta) = M(\sigma)$; in other words, $\sigma$ is a locking sequence for
$L_e$.

Now one can use K again to determine whether or not there exists an
$\eta \in (\{e\} \oplus \mathbb{N})^*$ such that $M(\sigma\eta) \ne M(\sigma)$.
Suppose that $|W_e|$ is finite. Then $\{e\} \oplus \mathbb{N}$ is not a finite
variant of $L_e$; furthermore, as M must Ex* learn $\{e\} \oplus \mathbb{N}$,
there must exist some $\eta \in (\{e\} \oplus \mathbb{N})^*$ for which
$M(\sigma\eta) \ne M(\sigma)$. Suppose, on the other hand, that $|W_e|$ is
infinite. Then $L_e = \{e\} \oplus \mathbb{N}$, so that by the locking sequence
property of $\sigma$, $M(\sigma\eta) = M(\sigma)$ for all
$\eta \in (\{e\} \oplus \mathbb{N})^*$. Hence the Ex* learnability of C would
imply that $\{e : |W_e| < \infty\}$ is Turing reducible to K, which is known to
be false [20]. □

Theorem 38. Suppose there is a recursive learner that BC* learns C and outputs
on every text for any $L \in C$ at least one index infinitely often. Then C is
BC*Part learnable.

Proof. Let M be a recursive BC* learner of C such that M outputs on every text
for any $L \in C$ at least one index infinitely often. Define a learner $N_1$
as follows.

On any given text T for some $L \in C$, let $e_n = M(T[n+1])$. Let
$F_0, F_1, F_2, \ldots$ be a one-one enumeration of all finite sets. On input
$T[k+1]$, $N_1$ outputs a canonical index $d_{e_k,l}$ for $W_{e_k} \cup F_l$
(respectively $g_{e_k,l}$ for $W_{e_k} - F_l$) at least m times iff the
following conditions hold:

1. M outputs $e_k$ at least $l + 1$ times;

2. there is a stage $s > m$ such that for all $x < m$,
$x \in \mathrm{range}(T[s+1])$ iff $x \in W_{e_k,s} \cup F_l$ (respectively
$x \in \mathrm{range}(T[s+1])$ iff $x \in W_{e_k,s} - F_l$).

It will be shown that $N_1$ has the following three learning properties: first,
it BC* learns C; second, it outputs at least one correct index infinitely
often; third, it outputs an incorrect index only finitely often. Consider any
$e_k$.

First, suppose that $W_{e_k}$ is not a finite variant of L. Then M outputs
$e_k$ only finitely often. Further, $N_1$ will consider sets of the form
$W_{e_k} \cup F_l$ or $W_{e_k} - F_l$ for only finitely many $F_l$. Since, for
each such $W_{e_k} \cup F_l$ (or $W_{e_k} - F_l$), item 2 will be satisfied for
only finitely many m, it follows that $N_1$ will conjecture a set of the form
$W_{e_k} \cup F_l$ or $W_{e_k} - F_l$ only finitely often.

Second, suppose that $W_{e_k}$ is a finite variant of L. Then for any $F_l$,
$W_{e_k} \cup F_l$ and $W_{e_k} - F_l$ are both finite variants of L. Hence
$N_1$ preserves its BC* learning property by outputting any indices for
$W_{e_k} \cup F_l$ or $W_{e_k} - F_l$. Moreover, M outputs infinitely often at
least one index $e_h$ such that $W_{e_h}$ is a finite variant of L. If
$L = W_{e_h} \cup F_c$ (respectively $L = W_{e_h} - F_c$) for some $F_c$, then
$N_1$ will consider $W_{e_h} \cup F_c$ (respectively $W_{e_h} - F_c$) after M
has output $e_h$ at least $c + 1$ times. As $W_{e_h} \cup F_c$ (respectively
$W_{e_h} - F_c$) satisfies item 2 for almost all m, $N_1$ will output at least
one index for L infinitely often.

Third, suppose that for some $F_l$, neither $W_{e_k} \cup F_l$ nor
$W_{e_k} - F_l$ is equal to L. Then $W_{e_k} \cup F_l$ and $W_{e_k} - F_l$ will
fail to satisfy Condition 2 for all but finitely many m, and so $N_1$ will
output a canonical index for $W_{e_k} \cup F_l$ or $W_{e_k} - F_l$ only
finitely often. This establishes the three learning properties of $N_1$.

Using a padding technique, one can define a further learner N such that N
preserves the BC* learning property of $N_1$; further, if $e'_h$ is the minimum
index that $N_1$ outputs infinitely often on T, then there is an $h'$ with
$e'_h = e_{h'}$ such that N will output $\mathrm{pad}(e'_h, d_{h'})$ infinitely
often, and every other index is output only finitely often. Therefore N is both
a BC* and a Part learner of C. □

Corollary 39. If a class C of r.e. sets is Vac* learnable, then C is BC*Part 
learnable. 

Example 40. Case and Smith [4] showed that the class of recursive functions
$F = \{f : f \text{ is recursive} \wedge \forall^\infty x\,[f = \varphi_{f(x)}]\}$
is BC learnable but not Ex* learnable.

By the equivalence of Ex* and Vac* in the setting of learning recursive
functions, F is also not Vac* learnable. Furthermore, by Theorem 38, the class
F witnesses the separation of Vac* and BC*Part learnability.

The following proposition shows that two relatively strong learning criteria can
be synthesized to produce quite a strict learning criterion.

Proposition 41. If a class C of r.e. sets is Vac*WPart learnable, then C is Vac
learnable.

Proof. Assume that M is a Vac*WPart learner of C. Define a new learner N as
follows. On input $\sigma$, let $e_0, e_1, \ldots, e_k$ be all the distinct
conjectures of M on prefixes of $\sigma$. For each $e_i$, let $p_i$ be the
maximum number such that for all $x < p_i$, $x \in W_{e_i,|\sigma|}$ holds iff
x is contained in $\mathrm{range}(\sigma)$. Furthermore, let
$q = \max(\{p_i : 0 \le i \le k\})$ and m be the least index such that
$p_m = q$; N then outputs $e_m$.
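The selection rule of N amounts to choosing, among the conjectures M has made so far, the earliest one whose approximation agrees longest with the data. A minimal Python sketch follows, with these assumptions labelled: `toy_enum` and `toy_M` are hypothetical stand-ins for the approximations $W_{e,s}$ and for the learner M, only non-empty prefixes are fed to M, conjectures are ordered by first appearance, and the agreement length is capped at $|\sigma|$ so that it is always defined.

```python
from typing import Callable, List, Sequence, Set, Union

Datum = Union[int, str]
Enum = Callable[[int, int], Set[int]]


def toy_enum(e: int, s: int) -> Set[int]:
    """Hypothetical enumerator (assumption): index 0 stands for the even
    numbers, any other index for all naturals; W_{e,s} = members below s."""
    return {x for x in range(s) if e != 0 or x % 2 == 0}


def toy_M(prefix: Sequence[Datum]) -> int:
    """Hypothetical learner (assumption): guesses index 1 on short prefixes
    and index 0 afterwards."""
    return 1 if len(prefix) < 3 else 0


def agreement(e: int, sigma: Sequence[Datum], enum: Enum) -> int:
    """p for hypothesis e: largest p <= |sigma| such that every x < p is in
    W_{e,|sigma|} exactly when it occurs in range(sigma)."""
    w = enum(e, len(sigma))
    seen = {x for x in sigma if x != "#"}
    p = 0
    while p < len(sigma) and ((p in w) == (p in seen)):
        p += 1
    return p


def learner_N(sigma: Sequence[Datum], M: Callable[[Sequence[Datum]], int],
              enum: Enum) -> int:
    conjectures: List[int] = []
    for k in range(1, len(sigma) + 1):       # distinct conjectures of M
        e = M(sigma[:k])
        if e not in conjectures:
            conjectures.append(e)
    ps = [agreement(e, sigma, enum) for e in conjectures]
    return conjectures[ps.index(max(ps))]    # least index attaining the maximum


# Usage: on data from the even numbers, the hypothesis "all naturals" agrees
# only up to 1, so N settles on the index for the evens.
print(learner_N([0, 2, 4, 6], toy_M, toy_enum))
```

Since M outputs only finitely many indices, the maximum ranges over a set that eventually stops growing, and the agreement of each incorrect index is bounded; that is what lets the selected index eventually be a correct one.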

Let $d_0, \ldots, d_l$ be all the distinct conjectures of M on some text T for
an $L \in C$. Since M is a WPart learner, it must output at least one index for
L on T. Consider any $d_i$, $d_j$ such that $W_{d_i} \ne L$ and $W_{d_j} = L$.
Let $z_i$ be the maximum number such that for all $x < z_i$, $x \in W_{d_i}$
holds iff $x \in L$. Then on almost all text prefixes $T[s]$, there must exist
some $y_j > z_i$ such that for all $x < y_j$, $x \in W_{d_j,s+1}$ iff x is
contained in $\mathrm{range}(T[s])$. As there are only finitely many incorrect
indices that M outputs, it follows that N will almost always output some index
$d_c$ for which $W_{d_c} = L$. Therefore N is a Vac learner of C. □

The following proposition implies that vacillatory learning cannot in general be 
combined with partial learning; in other words, a vacillatorily learnable class 
may not necessarily be vacillatorily as well as partially learnable at the same 
time. 

Proposition 42. If a class C of r.e. sets is Vac*Part learnable, then C is Ex 
learnable. 

Proof. If M is a recursive learner of C such that on any text T for some
$L \in C$, M outputs only finitely many indices and outputs exactly one index d
for L infinitely often, then M almost always outputs d on T. □

Example 43. The class of all cofinite sets is Ex* learnable (and hence Vac*
learnable) but it is not Ex learnable. By Proposition 42, this class is also
not Vac*Part learnable.

7 Conclusion 

This paper studied conditions under which various forms of partial learning can 
be combined with models of approximation and with BC* learning. For learning 
of recursive functions, it positively resolved Fulk and Jain’s open question on 
whether the class of all recursive functions can be approximately learnt and BC* 
learnt at the same time. For learning r.e. languages, three notions of approximate 
learning were introduced and studied. However, questions on the combinability 
of some pairs of learning constraints remain open. In particular, it is unknown 
whether or not every BC* learnable class of r.e. languages has a learner that is 
both BC* and Part. 